Hourly Analysis

Hour 4 Analysis

The linear regression model for hour 4 includes lag_1_production, DLWRF_surface, TMP_surface, hourly_cloud_average, special_period, trend_hour_4, and month as predictors. The model aims to capture the production pattern at 4 AM, when production is close to zero on most days because there is minimal solar activity.

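# Load the packages used throughout the hourly analysis
library(data.table)  # := and shift()
library(ggplot2)     # ggplot()
library(forecast)    # checkresiduals()

# Production data for hour 4
hour_4_data <- all_data[all_data$hour == 4, ]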

# Plot production for hour 4
ggplot(hour_4_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 4",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 4; setDT() makes sure := and shift() below work
hour_4_data <- all_data[all_data$hour == 4, ]
setDT(hour_4_data)
hour_4_data$trend_hour_4 <- 1:nrow(hour_4_data)
hour_4_data[, lag_1_production := shift(production,1)]
hour_4_data[,lag_1_diff:=production-lag_1_production]
hour_4_data <- hour_4_data[!is.na(lag_1_production)]
# Fit linear regression model for hour 4
lm_hour_4 <- lm(production ~+lag_1_production +DLWRF_surface+TMP_surface+hourly_cloud_average+special_period+trend_hour_4+month, data = hour_4_data)
# Summarize the model
summary(lm_hour_4)
## 
## Call:
## lm(formula = production ~ +lag_1_production + DLWRF_surface + 
##     TMP_surface + hourly_cloud_average + special_period + trend_hour_4 + 
##     month, data = hour_4_data)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.119491 -0.001286 -0.000220  0.000614  0.173167 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           2.774e-02  5.524e-02   0.502  0.61569    
## lag_1_production      6.867e-01  2.490e-02  27.577  < 2e-16 ***
## DLWRF_surface         2.489e-05  3.890e-05   0.640  0.52241    
## TMP_surface          -1.206e-04  2.290e-04  -0.527  0.59851    
## hourly_cloud_average -3.182e-05  2.898e-05  -1.098  0.27252    
## special_period       -6.329e-03  1.534e-03  -4.125 4.09e-05 ***
## trend_hour_4          6.630e-07  2.222e-06   0.298  0.76548    
## monthAug              1.974e-03  2.669e-03   0.740  0.45964    
## monthDec             -3.856e-04  2.204e-03  -0.175  0.86116    
## monthFeb             -2.568e-04  2.290e-03  -0.112  0.91074    
## monthJan             -1.861e-04  2.154e-03  -0.086  0.93119    
## monthJul              7.587e-03  2.557e-03   2.968  0.00309 ** 
## monthJun              1.589e-02  2.620e-03   6.065 2.01e-09 ***
## monthMar             -1.008e-04  2.027e-03  -0.050  0.96035    
## monthMay             -2.629e-04  1.990e-03  -0.132  0.89493    
## monthNov              1.310e-03  2.153e-03   0.608  0.54324    
## monthOct              2.622e-03  2.220e-03   1.181  0.23796    
## monthSep              2.288e-03  2.389e-03   0.958  0.33855    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0117 on 827 degrees of freedom
## Multiple R-squared:  0.6841, Adjusted R-squared:  0.6776 
## F-statistic: 105.4 on 17 and 827 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_4)

## 
##  Breusch-Godfrey test for serial correlation of order up to 21
## 
## data:  Residuals
## LM test = 422.17, df = 21, p-value < 2.2e-16

Model Summary

Coefficients and Significance:

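  • Lag 1 Production: Highly significant with a positive impact (estimate 0.687, p < 2e-16). Because the lag is taken after filtering to hour 4, it is the production at hour 4 on the previous day (assuming one row per day in date order), so yesterday’s value at this hour strongly predicts today’s.

  • DLWRF Surface, TMP Surface, Hourly Cloud Average, Trend Hour 4: None of these is statistically significant (all p > 0.25), indicating minimal impact on the prediction for this hour.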

  • Special Period: Significant and negatively impacting, indicating that during the special period, the production at hour 4 is lower.

  • Monthly Effects: Certain months like June and July have significant coefficients, indicating seasonal variations in production.
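Since the data are filtered to a single hour before the lag is taken, a one-row shift steps back one day rather than one hour. The following is a minimal sketch with invented numbers; the `toy` data frame and its values are hypothetical, and base R `c(NA, head(x, -1))` is used as a stand-in for data.table's `shift(x, 1)`, which gives the same result.

```r
# Hypothetical toy data: three days, two hours per day (values are invented).
toy <- data.frame(
  date       = rep(as.Date("2024-01-01") + 0:2, each = 2),
  hour       = rep(c(3, 4), times = 3),
  production = c(0.0, 0.1, 0.0, 0.2, 0.0, 0.3)
)

# Keep only hour 4, as the report does before building the lag.
hour_4_toy <- toy[toy$hour == 4, ]

# One-row lag; same result as data.table::shift(production, 1).
hour_4_toy$lag_1_production <- c(NA, head(hour_4_toy$production, -1))

# Each row now carries the hour-4 value of the previous day, not of hour 3.
print(hour_4_toy)
```

If the intent were to use the previous hour of the same day, the lag would have to be built on the full hourly table before filtering.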

Residuals and Diagnostics:

  • Residual Standard Error: Indicates the variability in the residuals or prediction errors.

  • Multiple R-squared: 0.6841, suggesting that the model explains about 68.41% of the variability in production.

  • Adjusted R-squared: 0.6776, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.
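As a quick consistency check, the adjusted R-squared can be recomputed from the other figures printed by summary(lm_hour_4). This sketch only uses numbers read from that output (multiple R-squared 0.6841, 17 predictors, 827 residual degrees of freedom); the variable names are illustrative.

```r
# Values read from the summary(lm_hour_4) output.
r_squared    <- 0.6841
n_predictors <- 17    # numerator degrees of freedom of the F-statistic
resid_df     <- 827   # residual degrees of freedom

# Number of rows used in the fit: residual df + predictors + intercept.
n_obs <- resid_df + n_predictors + 1

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / resid_df
print(round(adj_r_squared, 4))
```

The result should agree with the 0.6776 reported in the model output, up to rounding of the inputs.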

The WMAPE for hour 4 was found to be 1407.11%, indicating an inaccurate model.
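The report does not show how WMAPE was computed. Assuming the common definition (sum of absolute errors divided by sum of absolute actuals, times 100), the sketch below illustrates why an hour with near-zero production can produce a very large WMAPE even when the absolute errors are tiny. The function name `wmape` and all numbers are invented for illustration.

```r
# Common WMAPE definition (an assumption; the report does not show its formula).
wmape <- function(actual, forecast) {
  100 * sum(abs(actual - forecast)) / sum(abs(actual))
}

# Invented numbers: near-zero production, very small absolute errors.
actual_night   <- c(0.00, 0.00, 0.01, 0.00)
forecast_night <- c(0.02, 0.03, 0.04, 0.02)
wmape_night <- wmape(actual_night, forecast_night)

# Invented numbers: daytime-scale production, much larger absolute errors.
actual_day   <- c(8, 10, 12, 9)
forecast_day <- c(7, 11, 10, 10)
wmape_day <- wmape(actual_day, forecast_day)

print(c(night = wmape_night, day = wmape_day))
```

In this toy case the night-time errors are far smaller in absolute terms, yet the night-time WMAPE is much larger because the denominator is almost zero.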

Visualization and Interpretation

Hourly Production Data for Hour 4:

  • The plot shows limited production data, mostly clustered around specific periods, indicating times when solar production was active.

Residuals Analysis:

  • Top Plot (Residuals over time): Indicates potential periods of higher residuals, suggesting times when the model predictions were less accurate.

  • ACF Plot (Autocorrelation of Residuals): Shows autocorrelation in the residuals, which the Breusch-Godfrey test confirms (LM = 422.17, df = 21, p < 2.2e-16), so some patterns in the data are not captured by the model.

  • Histogram (Distribution of Residuals): Indicates that residuals are centered around zero but have some deviation, highlighting the areas where predictions might be off.
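To illustrate what autocorrelated residuals look like numerically, here is a small simulated sketch. The report's checkresiduals() call runs a Breusch-Godfrey test; this example swaps in the base R Ljung-Box test (Box.test) so that it needs no extra packages. The data, the AR coefficient of 0.6, and all variable names are invented.

```r
set.seed(1)
n <- 300
x <- rnorm(n)

# Simulated regression errors that follow an AR(1) process (invented setup).
ar_errors <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))
y <- 1 + 2 * x + ar_errors

# Fit a regression that ignores the serial dependence.
fit <- lm(y ~ x)
res <- residuals(fit)

# Sample autocorrelations of the residuals; element 1 is lag 0, element 2 is lag 1.
acf_vals <- acf(res, lag.max = 10, plot = FALSE)
lag1_autocorr <- acf_vals$acf[2]

# Ljung-Box test: a small p-value signals serial correlation in the residuals.
lb_test <- Box.test(res, lag = 10, type = "Ljung-Box")

print(lag1_autocorr)
print(lb_test$p.value)
```

A clearly positive lag-1 autocorrelation together with a small p-value is the same kind of signal as the large LM statistics reported for the hourly models.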

Hour 5 Analysis

For hour 5, a similar approach to the one used for hour 4 was applied. The linear regression model included lag_1_production, DLWRF_surface, is.ramadan, special_period, trend_hour_5, and month and hourly_max_t together with their interaction.
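The `month*hourly_max_t` term in the model formula expands to both main effects plus their interaction, so each month gets its own temperature slope. The sketch below shows the expansion and then works out the June slope from the coefficients reported in the hour 5 summary output (hourly_max_t = 5.309e-03, monthJun:hourly_max_t = 2.076e-02). The reference month is whichever level is absent from the coefficient table, which appears to be April.

```r
# The * operator expands to main effects plus the interaction.
model_formula <- production ~ month * hourly_max_t
term_labels <- attr(terms(model_formula), "term.labels")
print(term_labels)

# Coefficients copied from the hour 5 summary output.
base_slope <- 5.309e-03  # hourly_max_t: slope in the reference month
june_shift <- 2.076e-02  # monthJun:hourly_max_t: extra slope in June

# Temperature slope that applies in June.
june_slope <- base_slope + june_shift
print(june_slope)
```

Each interaction coefficient is therefore a difference from the reference month's slope, not a slope in its own right.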

#Production data for hour 5
hour_5_data <- all_data[all_data$hour == 5, ]

# Plot production for hour 5
ggplot(hour_5_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 5",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 5
hour_5_data <- all_data[all_data$hour == 5, ]
hour_5_data <- hour_5_data[,-c(2)]
hour_5_data[, lag_1_production := shift(production,1)]
hour_5_data[,lag_1_diff:=production-lag_1_production]
hour_5_data <- hour_5_data[!is.na(lag_1_production)]
hour_5_data$trend_hour_5 <- 1:nrow(hour_5_data)
# Fit linear regression model for hour 5
lm_hour_5 <- lm(production ~+lag_1_production+DLWRF_surface+is.ramadan+special_period+trend_hour_5 +month*hourly_max_t, data = hour_5_data)
# Summarize the model
summary(lm_hour_5)
## 
## Call:
## lm(formula = production ~ +lag_1_production + DLWRF_surface + 
##     is.ramadan + special_period + trend_hour_5 + month * hourly_max_t, 
##     data = hour_5_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.81132 -0.02527 -0.00681  0.01221  1.77736 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -1.486e+00  1.357e+00  -1.095   0.2739    
## lag_1_production       4.148e-01  3.170e-02  13.084  < 2e-16 ***
## DLWRF_surface          1.603e-04  2.449e-04   0.655   0.5129    
## is.ramadan            -2.878e-02  2.062e-02  -1.396   0.1632    
## special_period        -1.252e-01  1.648e-02  -7.596 8.35e-14 ***
## trend_hour_5           5.244e-05  2.256e-05   2.324   0.0204 *  
## monthAug              -8.013e-01  2.875e+00  -0.279   0.7805    
## monthDec               2.008e+00  1.932e+00   1.039   0.2989    
## monthFeb               1.969e+00  1.454e+00   1.354   0.1761    
## monthJan               2.122e+00  1.469e+00   1.444   0.1491    
## monthJul              -3.006e+00  2.497e+00  -1.204   0.2291    
## monthJun              -5.780e+00  3.416e+00  -1.692   0.0910 .  
## monthMar               1.831e+00  1.441e+00   1.271   0.2042    
## monthMay               1.819e+00  2.071e+00   0.878   0.3800    
## monthNov               2.996e+00  1.954e+00   1.534   0.1255    
## monthOct               3.060e+00  1.987e+00   1.540   0.1239    
## monthSep               1.814e+00  2.040e+00   0.889   0.3741    
## hourly_max_t           5.309e-03  4.954e-03   1.072   0.2842    
## monthAug:hourly_max_t  2.920e-03  1.009e-02   0.290   0.7723    
## monthDec:hourly_max_t -7.475e-03  7.017e-03  -1.065   0.2870    
## monthFeb:hourly_max_t -7.337e-03  5.268e-03  -1.393   0.1641    
## monthJan:hourly_max_t -7.891e-03  5.322e-03  -1.483   0.1385    
## monthJul:hourly_max_t  1.097e-02  8.822e-03   1.244   0.2140    
## monthJun:hourly_max_t  2.076e-02  1.207e-02   1.720   0.0858 .  
## monthMar:hourly_max_t -6.767e-03  5.201e-03  -1.301   0.1936    
## monthMay:hourly_max_t -6.724e-03  7.421e-03  -0.906   0.3651    
## monthNov:hourly_max_t -1.091e-02  7.053e-03  -1.547   0.1223    
## monthOct:hourly_max_t -1.094e-02  7.112e-03  -1.538   0.1243    
## monthSep:hourly_max_t -6.377e-03  7.256e-03  -0.879   0.3798    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1175 on 816 degrees of freedom
## Multiple R-squared:  0.5733, Adjusted R-squared:  0.5587 
## F-statistic: 39.16 on 28 and 816 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_5)

## 
##  Breusch-Godfrey test for serial correlation of order up to 32
## 
## data:  Residuals
## LM test = 198.33, df = 32, p-value < 2.2e-16
plot(lm_hour_5)

Model Summary

Coefficients and Significance:

  • Lag 1 Production: Highly significant with a positive impact, indicating that the previous day’s production at hour 5 (the lag is taken after filtering to this hour) strongly influences the current day’s production at the same hour.

  • DLWRF Surface: Not statistically significant (p = 0.513), indicating a minimal direct impact on the predictions for this hour.

  • Is Ramadan: Negative coefficient but not statistically significant (p = 0.163), so the model gives no clear evidence of lower production during Ramadan.

  • Special Period: Highly significant and negatively impacting, indicating lower production during this period.

  • Trend Hour 5: Significant and positively impacting, suggesting a gradual increase in production over time.

Monthly Effects:

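  • None of the month main effects or month:hourly_max_t interactions is significant at the 5% level; only June is marginal (p ≈ 0.09 for both terms), so the evidence for seasonal variation in the temperature effect is weak at this hour.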

Residuals and Diagnostics:

  • Residual Standard Error: Indicates the variability in the residuals or prediction errors.

  • Multiple R-squared: 0.5733, suggesting that the model explains about 57.33% of the variability in production.

  • Adjusted R-squared: 0.5587, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.

The WMAPE for hour 5 was found to be 154.10%, indicating an inaccurate model.

Visualization and Interpretation

Hourly Production Data for Hour 5:

  • The data for hour 5 is more populated than hour 4, indicating a clearer pattern of production as the sun rises.

Residuals Analysis:

  • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

  • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, suggesting that not all patterns in the data are fully captured by the model.

  • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 6 Analysis

For hour 6, the linear regression model included lag_1_production, USWRF_top_of_atmosphere, wday, hourly_cloud_average, is.ramadan, special_period, is.religousday, and month and hourly_max_t together with their interaction.

hour_6_data <- all_data[all_data$hour == 6, ]

# Plot production for hour 6
ggplot(hour_6_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 6",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 6
hour_6_data <- all_data[all_data$hour == 6, ]
hour_6_data <- hour_6_data[, -c(2)]
# Convert the data frame to a data.table
setDT(hour_6_data)
# Create a lagged variable for production with a lag of 1 period
hour_6_data[, lag_1_production := shift(production,1)]
hour_6_data[,lag_1_diff:=production-lag_1_production]
hour_6_data <- hour_6_data[!is.na(lag_1_production)]
hour_6_data$trend_hour_6 <- 1:nrow(hour_6_data)
# Fit linear regression model for hour 6
lm_hour_6 <- lm(production~+lag_1_production+USWRF_top_of_atmosphere+wday+ hourly_cloud_average+is.ramadan+special_period+is.religousday+month*hourly_max_t, data = hour_6_data) 
#lm_hour_6 <- lm(production ~ +uswrf_top_of_atmosphere + is.ramadan +is.weekend+ is.religousday +is.publicholiday + month*hourly_max_t , data = hour_6_data)
# Summarize the model
summary(lm_hour_6)
## 
## Call:
## lm(formula = production ~ +lag_1_production + USWRF_top_of_atmosphere + 
##     wday + hourly_cloud_average + is.ramadan + special_period + 
##     is.religousday + month * hourly_max_t, data = hour_6_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5352 -0.2127 -0.0409  0.1283  4.6541 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -1.329e+01  6.793e+00  -1.956   0.0508 .  
## lag_1_production         4.110e-01  3.288e-02  12.501   <2e-16 ***
## USWRF_top_of_atmosphere  5.117e-02  3.858e-02   1.326   0.1851    
## wdayMon                  6.914e-02  7.718e-02   0.896   0.3707    
## wdaySat                  1.176e-01  7.711e-02   1.526   0.1275    
## wdaySun                  7.834e-04  7.685e-02   0.010   0.9919    
## wdayThu                  4.277e-02  7.722e-02   0.554   0.5798    
## wdayTue                 -2.025e-02  7.721e-02  -0.262   0.7932    
## wdayWed                  1.785e-01  7.753e-02   2.303   0.0216 *  
## hourly_cloud_average    -2.360e-03  9.317e-04  -2.533   0.0115 *  
## is.ramadan              -7.392e-02  1.034e-01  -0.715   0.4750    
## special_period          -7.734e-01  7.800e-02  -9.915   <2e-16 ***
## is.religousday           1.333e-01  1.297e-01   1.027   0.3046    
## monthAug                 1.157e+01  1.445e+01   0.800   0.4238    
## monthDec                 8.033e+00  9.868e+00   0.814   0.4159    
## monthFeb                 1.166e+01  7.449e+00   1.565   0.1180    
## monthJan                 1.184e+01  7.522e+00   1.573   0.1160    
## monthJul                 1.323e+01  1.291e+01   1.025   0.3056    
## monthJun                 2.234e+01  1.765e+01   1.266   0.2060    
## monthMar                 1.069e+01  7.422e+00   1.441   0.1501    
## monthMay                 1.781e+01  1.075e+01   1.657   0.0979 .  
## monthNov                 1.516e+01  1.001e+01   1.515   0.1302    
## monthOct                 2.115e+01  1.005e+01   2.105   0.0356 *  
## monthSep                 1.213e+01  1.038e+01   1.169   0.2428    
## hourly_max_t             4.981e-02  2.449e-02   2.034   0.0423 *  
## monthAug:hourly_max_t   -4.072e-02  5.079e-02  -0.802   0.4230    
## monthDec:hourly_max_t   -3.044e-02  3.587e-02  -0.849   0.3963    
## monthFeb:hourly_max_t   -4.352e-02  2.701e-02  -1.612   0.1074    
## monthJan:hourly_max_t   -4.425e-02  2.727e-02  -1.623   0.1051    
## monthJul:hourly_max_t   -4.523e-02  4.558e-02  -0.992   0.3213    
## monthJun:hourly_max_t   -7.796e-02  6.211e-02  -1.255   0.2098    
## monthMar:hourly_max_t   -3.931e-02  2.682e-02  -1.466   0.1430    
## monthMay:hourly_max_t   -6.496e-02  3.864e-02  -1.681   0.0931 .  
## monthNov:hourly_max_t   -5.563e-02  3.617e-02  -1.538   0.1245    
## monthOct:hourly_max_t   -7.557e-02  3.600e-02  -2.099   0.0361 *  
## monthSep:hourly_max_t   -4.255e-02  3.699e-02  -1.150   0.2504    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5959 on 809 degrees of freedom
## Multiple R-squared:  0.6395, Adjusted R-squared:  0.6239 
## F-statistic:    41 on 35 and 809 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_6)

## 
##  Breusch-Godfrey test for serial correlation of order up to 39
## 
## data:  Residuals
## LM test = 147.51, df = 39, p-value = 1.595e-14
plot(lm_hour_6)

Model Summary

Coefficients and Significance:

  • Lag 1 Production: Highly significant with a positive impact, indicating that the previous day’s production at hour 6 (the lag is taken after filtering to this hour) strongly influences the current day’s production at the same hour.

  • USWRF Top of Atmosphere: Not statistically significant (p = 0.185), indicating a minimal direct impact on the predictions for this hour.

  • Wday: Only Wednesday differs significantly from the reference day (p = 0.022); the other days do not, so the evidence for a weekly pattern in production is weak.

  • Hourly Cloud Average: Significant and negatively impacting (p = 0.012), indicating that higher cloud coverage reduces production.

  • Is Ramadan: Negative coefficient but not statistically significant (p = 0.475), so the model gives no clear evidence of lower production during Ramadan.

  • Special Period: Highly significant and negatively impacting, indicating lower production during this period.

  • Monthly Effects: Only the October interaction with hourly_max_t is significant at the 5% level (p = 0.036), with May marginal (p = 0.093), giving limited evidence of seasonal variation in the temperature effect.
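Coefficients for factor predictors such as wday and month are differences from a reference level, which R takes to be the first level (alphabetical by default, hence no wdayFri row in the output). The following is a minimal sketch on simulated data showing how the reference level can be changed with relevel(); the `toy` data frame and the `wday_mon` column are hypothetical.

```r
set.seed(42)
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Simulated data with no real weekday effect (values are invented).
toy <- data.frame(
  wday       = factor(sample(days, 200, replace = TRUE)),
  production = rnorm(200)
)

# With default alphabetical levels, the first level is the reference.
default_ref <- levels(toy$wday)[1]
fit_default <- lm(production ~ wday, data = toy)

# Make Monday the reference instead; each coefficient is now "day minus Monday".
toy$wday_mon <- relevel(toy$wday, ref = "Mon")
fit_mon <- lm(production ~ wday_mon, data = toy)

print(default_ref)
print(names(coef(fit_mon)))
```

Changing the reference level changes which contrasts are tested, but not the fitted values of the model.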

Residuals and Diagnostics:

  • Residual Standard Error: Indicates the variability in the residuals or prediction errors.

  • Multiple R-squared: 0.6395, suggesting that the model explains about 63.95% of the variability in production.

  • Adjusted R-squared: 0.6239, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.

The WMAPE for hour 6 was found to be 66.29%, indicating a reasonably accurate model.

Visualization and Interpretation

Hourly Production Data for Hour 6:

  • The data for hour 6 is more populated than earlier hours, indicating a clearer pattern of production as the sun rises and solar activity increases.

Residuals Analysis:

  • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

  • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, suggesting that not all patterns in the data are fully captured by the model.

  • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 7 Analysis

For hour 7, the linear regression model included lag_1_production, trend_hour_7, special_period, DLWRF_surface, hourly_cloud_average, month, and hourly_max_t.

# Production data for hour 7 (defined here so the plot below can use it)
hour_7_data <- all_data[all_data$hour == 7, ]

# Plot production for hour 7
ggplot(hour_7_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 7",
       x = "Date",
       y = "Production") +
  theme_minimal()

library(data.table)

# Filter data for hour 7 and remove the hour column
hour_7_data <- all_data[all_data$hour == 7, ]
hour_7_data <- hour_7_data[, -c(2)]

# Create a trend variable for hour 7
hour_7_data$trend_hour_7 <- 1:nrow(hour_7_data)

# Convert the data frame to a data.table
setDT(hour_7_data)

# Create a lagged variable for production with a lag of 1 period
hour_7_data[, lag_1_production := shift(production,1)]
hour_7_data[,lag_1_diff:=production-lag_1_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_7_data <- hour_7_data[!is.na(lag_1_production)]

# Fit linear regression model for hour 7 including the lagged variable
lm_hour_7 <- lm(production ~+lag_1_production+trend_hour_7 + special_period + DLWRF_surface + hourly_cloud_average + month+hourly_max_t,data=hour_7_data)
summary(lm_hour_7)
## 
## Call:
## lm(formula = production ~ +lag_1_production + trend_hour_7 + 
##     special_period + DLWRF_surface + hourly_cloud_average + month + 
##     hourly_max_t, data = hour_7_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.4967 -0.6421  0.0727  0.7645  8.9775 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          -1.746e+01  5.775e+00  -3.024 0.002573 ** 
## lag_1_production      3.037e-01  3.142e-02   9.665  < 2e-16 ***
## trend_hour_7          4.775e-04  2.486e-04   1.921 0.055064 .  
## special_period       -6.060e-01  1.710e-01  -3.543 0.000418 ***
## DLWRF_surface        -1.576e-02  4.230e-03  -3.725 0.000209 ***
## hourly_cloud_average -1.027e-02  3.479e-03  -2.951 0.003256 ** 
## monthAug             -6.492e-01  3.159e-01  -2.055 0.040195 *  
## monthDec             -2.015e+00  2.810e-01  -7.171 1.65e-12 ***
## monthFeb             -1.207e+00  2.819e-01  -4.280 2.09e-05 ***
## monthJan             -2.004e+00  2.745e-01  -7.302 6.66e-13 ***
## monthJul              3.540e-01  2.999e-01   1.180 0.238195    
## monthJun              4.946e-01  2.870e-01   1.724 0.085146 .  
## monthMar             -4.739e-01  2.467e-01  -1.921 0.055093 .  
## monthMay             -7.938e-02  2.426e-01  -0.327 0.743572    
## monthNov             -1.088e+00  2.643e-01  -4.118 4.20e-05 ***
## monthOct              1.606e-01  2.652e-01   0.605 0.545018    
## monthSep              1.085e-01  2.786e-01   0.390 0.696995    
## hourly_max_t          8.783e-02  2.385e-02   3.683 0.000245 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.366 on 827 degrees of freedom
## Multiple R-squared:  0.6479, Adjusted R-squared:  0.6407 
## F-statistic: 89.52 on 17 and 827 DF,  p-value: < 2.2e-16
# Check residuals
library(forecast)
checkresiduals(lm_hour_7)

## 
##  Breusch-Godfrey test for serial correlation of order up to 21
## 
## data:  Residuals
## LM test = 57.848, df = 21, p-value = 2.689e-05
# Plot the model
plot(lm_hour_7)

Model Summary

Coefficients and Significance:

  • Lag 1 Production: Highly significant with a positive impact, indicating that the previous day’s production at hour 7 (the lag is taken after filtering to this hour) strongly influences the current day’s production at the same hour.

  • Trend Hour 7: Positive but only marginally significant (p = 0.055), giving weak evidence of a gradual increase in production over time.

  • Special Period: Highly significant and negatively impacting, indicating lower production during this period.

  • DLWRF Surface: Significant and negatively impacting, indicating that downward longwave radiation reduces production.

  • Hourly Cloud Average: Significant and negatively impacting, indicating that higher cloud coverage reduces production.

  • Monthly Effects: Several months showed significant effects, highlighting seasonal variations in production.

  • Hourly Max T: Significant and positively impacting, indicating that higher temperatures are associated with higher production. This model has no month interaction, so the effect is the same in every month.
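Reading significance off a long coefficient table by eye is error-prone, so the p-values can also be filtered programmatically. The sketch below uses simulated data and hypothetical names (`toy`, `x1`, `x2`); the same `coef(summary(...))` call works on any fitted lm object, such as the hourly models in this report.

```r
set.seed(7)
n <- 200

# Simulated data: x1 has a real effect on y, x2 does not (values are invented).
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 0.8 * toy$x1 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = toy)

# Coefficient matrix with estimates, standard errors, t values and p-values.
coef_table <- coef(summary(fit))
p_values <- coef_table[, "Pr(>|t|)"]

# Terms significant at the 5% level, and terms that are only marginal (5% to 10%).
significant <- names(p_values)[p_values < 0.05]
marginal    <- names(p_values)[p_values >= 0.05 & p_values < 0.10]

print(significant)
print(marginal)
```

Separating the 5% and 10% bands makes it easier to describe terms as significant or merely marginal in the written summary.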

Residuals and Diagnostics:

  • Residual Standard Error: Indicates the variability in the residuals or prediction errors.

  • Multiple R-squared: 0.6479, suggesting that the model explains about 64.79% of the variability in production.

  • Adjusted R-squared: 0.6407, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.

The WMAPE for hour 7 was found to be 29.76%, indicating a reasonably accurate model.

Visualization and Interpretation

Hourly Production Data for Hour 7:

  • The plot for hour 7 shows a clearer production pattern as the sun rises higher, which is consistent with the lower WMAPE at this hour.

Residuals Analysis:

  • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

  • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, suggesting that not all patterns in the data are fully captured by the model.

  • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 8 Analysis

For hour 8, the linear regression model included lag_1_production, trend_hour_8, CSNOW_surface, DLWRF_surface, hourly_cloud_average, is.ramadan, and month and hourly_max_t together with their interaction; most of these predictors are statistically significant.

For hour 8, the residual ACF plot still shows significant autocorrelation at several lags, but no additional lag terms were added, in order to avoid multicollinearity among the lagged predictors.

The same approach was followed for the other hours.

# Production data for hour 8 (defined here so the plot below can use it)
hour_8_data <- all_data[all_data$hour == 8, ]

# Plot production for hour 8
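The multicollinearity concern can be made concrete with a variance inflation factor (VIF), which is 1 / (1 - R²) from regressing one predictor on the others. This is a sketch on a simulated, strongly autocorrelated series; the AR coefficient of 0.9 and all names are invented, and the VIF is computed by hand so that no extra packages are needed.

```r
set.seed(3)
n <- 500

# Simulated persistent series standing in for production (invented setup).
series <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))

# First and second lags, built the same way as a one- and two-row shift.
lag_1 <- c(NA, head(series, -1))
lag_2 <- c(NA, NA, head(series, -2))
toy <- na.omit(data.frame(series, lag_1, lag_2))

# VIF of lag_1 when lag_2 is also in the model: 1 / (1 - R^2),
# where R^2 comes from regressing lag_1 on the other predictor.
r2_lag_1  <- summary(lm(lag_1 ~ lag_2, data = toy))$r.squared
vif_lag_1 <- 1 / (1 - r2_lag_1)

print(vif_lag_1)
```

A VIF well above 1 means the two lags carry largely the same information, which inflates the standard errors of both coefficients.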
ggplot(hour_8_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 8",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 8
hour_8_data <- all_data[all_data$hour == 8, ]
hour_8_data <- hour_8_data[,-c(2)]
hour_8_data$trend_hour_8 <- 1:nrow(hour_8_data)
# Create a lagged variable for production with a lag of 1 period
hour_8_data[, lag_1_production := shift(production,1)]
hour_8_data[,lag_1_diff:=production-lag_1_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_8_data <- hour_8_data[!is.na(lag_1_production)]
#Fit linear regression model for hour 8
lm_hour_8 <- lm(production ~ +lag_1_production+trend_hour_8+ CSNOW_surface+DLWRF_surface+hourly_cloud_average+is.ramadan+month*hourly_max_t, data = hour_8_data)
# Summarize the model
summary(lm_hour_8)
## 
## Call:
## lm(formula = production ~ +lag_1_production + trend_hour_8 + 
##     CSNOW_surface + DLWRF_surface + hourly_cloud_average + is.ramadan + 
##     month * hourly_max_t, data = hour_8_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.7023 -0.9553  0.1940  1.0672  4.7857 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -3.779e+00  1.668e+01  -0.227 0.820830    
## lag_1_production       1.923e-01  2.934e-02   6.552 1.00e-10 ***
## trend_hour_8           1.017e-03  2.879e-04   3.532 0.000436 ***
## CSNOW_surface         -1.251e+00  4.430e-01  -2.823 0.004871 ** 
## DLWRF_surface         -2.500e-02  5.239e-03  -4.771 2.17e-06 ***
## hourly_cloud_average  -2.636e-02  4.731e-03  -5.572 3.42e-08 ***
## is.ramadan            -4.809e-01  3.227e-01  -1.490 0.136525    
## monthAug               9.521e+00  3.833e+01   0.248 0.803901    
## monthDec              -5.832e+01  2.807e+01  -2.078 0.038046 *  
## monthFeb              -2.835e+01  1.920e+01  -1.476 0.140240    
## monthJan              -3.938e+01  1.973e+01  -1.996 0.046273 *  
## monthJul              -2.735e+00  3.322e+01  -0.082 0.934418    
## monthJun              -2.855e+00  3.410e+01  -0.084 0.933309    
## monthMar              -2.354e+01  1.966e+01  -1.197 0.231663    
## monthMay               3.804e+00  2.449e+01   0.155 0.876610    
## monthNov              -1.300e+01  2.601e+01  -0.500 0.617249    
## monthOct              -1.184e+01  2.713e+01  -0.437 0.662492    
## monthSep               3.274e+01  2.576e+01   1.271 0.204133    
## hourly_max_t           5.661e-02  5.988e-02   0.945 0.344725    
## monthAug:hourly_max_t -3.199e-02  1.297e-01  -0.247 0.805149    
## monthDec:hourly_max_t  2.065e-01  1.016e-01   2.033 0.042423 *  
## monthFeb:hourly_max_t  1.007e-01  6.877e-02   1.464 0.143456    
## monthJan:hourly_max_t  1.372e-01  7.077e-02   1.938 0.052980 .  
## monthJul:hourly_max_t  1.088e-02  1.137e-01   0.096 0.923779    
## monthJun:hourly_max_t  1.093e-02  1.173e-01   0.093 0.925749    
## monthMar:hourly_max_t  8.580e-02  7.003e-02   1.225 0.220891    
## monthMay:hourly_max_t -1.244e-02  8.562e-02  -0.145 0.884487    
## monthNov:hourly_max_t  4.300e-02  9.308e-02   0.462 0.644230    
## monthOct:hourly_max_t  4.253e-02  9.566e-02   0.445 0.656739    
## monthSep:hourly_max_t -1.123e-01  8.950e-02  -1.254 0.210115    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.798 on 815 degrees of freedom
## Multiple R-squared:  0.5968, Adjusted R-squared:  0.5825 
## F-statistic:  41.6 on 29 and 815 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_8)

## 
##  Breusch-Godfrey test for serial correlation of order up to 33
## 
## data:  Residuals
## LM test = 66.346, df = 33, p-value = 0.000508
plot(lm_hour_8)

Model Summary:

  • Lag 1 Production: Highly significant with a positive impact, indicating that the previous day’s production at hour 8 (the lag is taken after filtering to this hour) significantly influences the current day’s production at the same hour.

  • Trend Hour 8: Significant with a positive impact, suggesting a gradual increase in production over time.

  • Csnow Surface: Significant with a negative impact, indicating snowfall negatively affects production.

  • DLWRF Surface: Highly significant with a negative impact, indicating that downward longwave radiation at the surface negatively affects production.

  • Hourly Cloud Average: Significant with a negative impact, indicating cloud cover reduces production.

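  • Is Ramadan: Negative coefficient but not statistically significant (p = 0.137), so the model gives no clear evidence of lower production during Ramadan.

Monthly Effects:

  • Only the December interaction with hourly_max_t is significant at the 5% level (p = 0.042), with January marginal (p = 0.053), suggesting that the temperature effect differs mainly in the winter months.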

Residuals and Diagnostics:

  • Residual Standard Error: Indicates the variability in residuals or prediction errors.

  • Multiple R-squared: 0.5968, suggesting that the model explains about 59.68% of the variability in production.

  • Adjusted R-squared: 0.5825, slightly lower than Multiple R-squared, accounting for the number of predictors.

  • F-statistic: Significant, indicating the model provides a better fit than a model with no predictors.

The WMAPE for hour 8 was found to be 22.69%, indicating a reasonably accurate model.

Visualization and Interpretation:

Hourly Production Data for Hour 8:

  • The data for hour 8 shows a clear pattern of production, reflecting the increasing sunlight during this time of the morning.

Residuals Analysis:

  • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when model predictions were less accurate.

  • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, suggesting not all patterns in the data are fully captured by the model.

  • Histogram (Distribution of Residuals): Centered around zero but with deviations, indicating areas where predictions might be off.

Hour 9 Analysis

For hour 9, a linear regression model was created, including lag_1_production, special_period, DSWRF_surface, CSNOW_surface, DLWRF_surface, is.nationalday, USWRF_surface, hourly_cloud_average, and month and hourly_max_t together with their interaction.

# Production data for hour 9 (defined here so the plot below can use it)
hour_9_data <- all_data[all_data$hour == 9, ]

# Plot production for hour 9
ggplot(hour_9_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 9",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 9
hour_9_data <- all_data[all_data$hour == 9, ]
hour_9_data <- hour_9_data[,-c(2)]
hour_9_data$trend_hour_9 <- 1:nrow(hour_9_data)
hour_9_data[, lag_1_production := shift(production,1)]
hour_9_data[,lag_1_diff:=production-lag_1_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_9_data <- hour_9_data[!is.na(lag_1_production)]
# Fit linear regression model for hour 9
lm_hour_9 <- lm(production ~+lag_1_production+special_period+DSWRF_surface+CSNOW_surface+DLWRF_surface+is.nationalday+USWRF_surface+hourly_cloud_average+month*hourly_max_t, data = hour_9_data)
# Summarize the model
summary(lm_hour_9)
## 
## Call:
## lm(formula = production ~ +lag_1_production + special_period + 
##     DSWRF_surface + CSNOW_surface + DLWRF_surface + is.nationalday + 
##     USWRF_surface + hourly_cloud_average + month * hourly_max_t, 
##     data = hour_9_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.4226 -0.9775  0.2714  1.1875  5.3309 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -33.320259  19.054706  -1.749 0.080727 .  
## lag_1_production        0.103092   0.028440   3.625 0.000307 ***
## special_period         -0.312502   0.211754  -1.476 0.140391    
## DSWRF_surface          -0.033438   0.015008  -2.228 0.026153 *  
## CSNOW_surface          -1.403326   0.521675  -2.690 0.007291 ** 
## DLWRF_surface          -0.035314   0.006185  -5.710 1.58e-08 ***
## is.nationalday          1.304794   0.542390   2.406 0.016367 *  
## USWRF_surface           0.097334   0.044588   2.183 0.029322 *  
## hourly_cloud_average   -0.030182   0.005345  -5.647 2.26e-08 ***
## monthAug                9.616202  41.496349   0.232 0.816801    
## monthDec              -99.793066  32.001599  -3.118 0.001883 ** 
## monthFeb              -13.875015  19.317473  -0.718 0.472802    
## monthJan              -69.201192  20.682438  -3.346 0.000858 ***
## monthJul               31.871323  32.065862   0.994 0.320551    
## monthJun              -15.514790  32.084706  -0.484 0.628830    
## monthMar              -28.662861  20.496925  -1.398 0.162375    
## monthMay              -18.575056  23.384689  -0.794 0.427238    
## monthNov               -4.709414  24.821347  -0.190 0.849566    
## monthOct              -28.069339  29.292627  -0.958 0.338227    
## monthSep               32.201308  25.609441   1.257 0.208970    
## hourly_max_t            0.179545   0.069815   2.572 0.010296 *  
## monthAug:hourly_max_t  -0.035523   0.137363  -0.259 0.796001    
## monthDec:hourly_max_t   0.360259   0.114939   3.134 0.001784 ** 
## monthFeb:hourly_max_t   0.052070   0.068715   0.758 0.448810    
## monthJan:hourly_max_t   0.250637   0.073880   3.393 0.000726 ***
## monthJul:hourly_max_t  -0.104955   0.107968  -0.972 0.331293    
## monthJun:hourly_max_t   0.054977   0.108866   0.505 0.613698    
## monthMar:hourly_max_t   0.104192   0.072343   1.440 0.150179    
## monthMay:hourly_max_t   0.064481   0.080595   0.800 0.423909    
## monthNov:hourly_max_t   0.015062   0.087706   0.172 0.863695    
## monthOct:hourly_max_t   0.097286   0.101629   0.957 0.338716    
## monthSep:hourly_max_t  -0.109442   0.087587  -1.250 0.211834    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.966 on 813 degrees of freedom
## Multiple R-squared:  0.5732, Adjusted R-squared:  0.5569 
## F-statistic: 35.22 on 31 and 813 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_9)

## 
##  Breusch-Godfrey test for serial correlation of order up to 35
## 
## data:  Residuals
## LM test = 25.506, df = 35, p-value = 0.8801
plot(lm_hour_9)

Model Summary

Coefficients and Significance:

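  • Lag 1 Production: Significant with a positive coefficient. Because the data are filtered to hour 9 before lagging, lag_1_production is the hour 9 production of the previous observation (the previous day when no days are missing), so the prior day's output at this hour helps predict the current one.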

  • Special Period: Not significant, suggesting minimal direct impact on the predictions for this hour.

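  • DSWRF Surface: Significant (p = 0.026) with a negative coefficient, indicating lower production with higher downward shortwave radiation at the surface once the other predictors are held fixed. This sign is counterintuitive and may reflect correlation with USWRF_surface, which is also in the model.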

  • CSnow Surface: Significant and negatively impacting, suggesting a decrease in production with increasing snow cover.

  • DLWRF Surface: Highly significant and negatively impacting, indicating that downward longwave radiation negatively affects production.

  • Is National Day: Significant at the 5% level (p = 0.016) with a positive coefficient, suggesting higher production on national days.

  • USWRF Surface: Significant and positively impacting, indicating higher production with increasing upward shortwave radiation at the surface.

  • Hourly Cloud Average: Significant and negatively impacting, suggesting lower production with higher cloud coverage.
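The meaning of lag_1_production depends on the order of operations: the data are filtered to one hour first and lagged afterwards. The sketch below illustrates this on made-up data (the `toy` table and its values are hypothetical). Base R is used so it runs without extra packages; `c(NA, head(x, -1))` plays the role of `shift(x, 1)` from the analysis code.

```r
# Toy data: 3 days x 24 hours; production = 100 * day + hour so every value is traceable
toy <- expand.grid(hour = 0:23, day = 1:3)
toy$production <- 100 * toy$day + toy$hour

# Same order of operations as the analysis: filter to one hour, then lag by one row
hour_9_toy <- toy[toy$hour == 9, ]
hour_9_toy$lag_1_production <- c(NA, head(hour_9_toy$production, -1))

print(hour_9_toy)
```

For day 2 the lagged value is 109 (hour 9 of day 1), not 208 (hour 8 of day 2), so a one-row lag on hour-filtered data is a one-day lag.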

Monthly Effects:

Some months showed significant interactions with hourly_max_t, highlighting seasonal variations in production. For example, the December and January interaction terms are significant (p = 0.0018 and p = 0.0007), so the effect of temperature on production in those months differs from that in the reference month.

Residuals and Diagnostics

  • Residual Standard Error: 1.966 on 813 degrees of freedom; it measures the typical size of the residuals (prediction errors).

  • Multiple R-squared: 0.5732, suggesting that the model explains about 57.32% of the variability in production.

  • Adjusted R-squared: 0.5569, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.

The WMAPE for hour 9 was found to be 19.51%, indicating a reasonably accurate model.
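The report does not show how WMAPE was computed. A minimal sketch, assuming the common definition (sum of absolute errors divided by the sum of absolute actual values), is given below; the function name `wmape` and the example vectors are illustrative only.

```r
# WMAPE under the assumed definition: sum(|actual - forecast|) / sum(|actual|)
wmape <- function(actual, forecast) {
  stopifnot(length(actual) == length(forecast))
  sum(abs(actual - forecast)) / sum(abs(actual))
}

# Made-up values: absolute errors are 2, 2, 3, 3 and the actuals sum to 100
actual   <- c(10, 20, 30, 40)
forecast <- c(12, 18, 33, 37)
wmape(actual, forecast)
```

Applied to a fitted model, the same function could be called as `wmape(hour_9_data$production, fitted(lm_hour_9))`, although that would be an in-sample figure and need not match the value reported here.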

Visualization and Interpretation

Hourly Production Data for Hour 9: The production data for hour 9 is well-populated, showing clearer patterns as the day progresses.

Residuals Analysis:

  • Top Plot (Residuals over time): Displays the residuals over time, indicating periods of higher residuals and less accurate model predictions.

  • ACF Plot (Autocorrelation of Residuals): Shows some autocorrelation at individual lags, although the Breusch-Godfrey test (LM = 25.506, p = 0.8801) does not reject the null hypothesis of no serial correlation up to order 35.

  • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.
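The Breusch-Godfrey statistic printed by checkresiduals() can be reproduced by hand, which makes its interpretation clearer. The sketch below is an illustration on simulated data, not the code used in this report: it regresses the residuals on the original predictors plus their own lags and uses n times the R-squared of that auxiliary regression as a chi-squared statistic. The function name `bg_lm_test` and the simulated series are assumptions.

```r
# Breusch-Godfrey LM test computed by hand (chi-squared form)
bg_lm_test <- function(model, order) {
  e <- residuals(model)
  X <- model.matrix(model)
  n <- length(e)
  # Lagged residuals; the unavailable leading values are set to zero
  E <- sapply(seq_len(order), function(k) c(rep(0, k), e[seq_len(n - k)]))
  aux <- lm.fit(cbind(X, E), e)                       # auxiliary regression
  stat <- n * sum(aux$fitted.values^2) / sum(e^2)     # n * R-squared
  list(statistic = stat, df = order,
       p.value = pchisq(stat, df = order, lower.tail = FALSE))
}

# Simulated regression with AR(1) errors, so serial correlation is present by design
set.seed(42)
n <- 300
x <- rnorm(n)
u <- as.numeric(arima.sim(list(ar = 0.8), n = n))
y <- 1 + 2 * x + u
fit <- lm(y ~ x)

res <- bg_lm_test(fit, order = 5)
print(res)
```

A small p-value, as in this simulation, indicates leftover serial correlation; a large one, such as the 0.8801 reported for hour 9, gives no evidence against uncorrelated residuals.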

Hour 10 Analysis

For hour 10, the linear regression model included the predictors lag_1_production, CSNOW_surface, DLWRF_surface, hourly_cloud_average, is.weekend, is.religousday, and is.nationalday, plus the interaction between month and hourly_max_t.

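# Production data for hour 10 (must exist before it is plotted)
hour_10_data <- all_data[all_data$hour == 10, ]

# Plot production for hour 10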
ggplot(hour_10_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 10",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 10, then create trend and lag variables
hour_10_data <- all_data[all_data$hour == 10, ]
hour_10_data <- hour_10_data[,-c(2)]
hour_10_data$trend_hour_10 <- 1:nrow(hour_10_data)
hour_10_data[, lag_1_production := shift(production,1)]
hour_10_data[,lag_1_diff:=production-lag_1_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_10_data <- hour_10_data[!is.na(lag_1_production)]
# Fit linear regression model for hour 10
lm_hour_10 <- lm(production ~+lag_1_production +CSNOW_surface+DLWRF_surface+hourly_cloud_average+is.weekend+is.religousday+is.nationalday+month*hourly_max_t, data = hour_10_data)
summary(lm_hour_10)
## 
## Call:
## lm(formula = production ~ +lag_1_production + CSNOW_surface + 
##     DLWRF_surface + hourly_cloud_average + is.weekend + is.religousday + 
##     is.nationalday + month * hourly_max_t, data = hour_10_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.9511 -0.6960  0.2747  1.0427  5.3855 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -1.917e+01  1.426e+01  -1.344 0.179270    
## lag_1_production       8.965e-02  2.806e-02   3.195 0.001454 ** 
## CSNOW_surface         -2.209e+00  5.742e-01  -3.847 0.000129 ***
## DLWRF_surface         -3.109e-02  4.798e-03  -6.481 1.58e-10 ***
## hourly_cloud_average  -2.250e-02  5.252e-03  -4.284 2.06e-05 ***
## is.weekend            -1.388e-01  1.564e-01  -0.887 0.375188    
## is.religousday         8.657e-01  4.385e-01   1.974 0.048696 *  
## is.nationalday         8.857e-01  5.634e-01   1.572 0.116309    
## monthAug              -7.828e+00  4.126e+01  -0.190 0.849571    
## monthDec              -1.390e+02  3.096e+01  -4.490 8.16e-06 ***
## monthFeb               2.690e+00  1.823e+01   0.148 0.882754    
## monthJan              -8.599e+01  1.956e+01  -4.397 1.25e-05 ***
## monthJul               1.212e+01  2.824e+01   0.429 0.667890    
## monthJun              -4.578e+01  2.890e+01  -1.584 0.113538    
## monthMar               9.752e+00  1.747e+01   0.558 0.576945    
## monthMay              -1.785e+01  2.130e+01  -0.838 0.402088    
## monthNov              -1.706e+01  2.146e+01  -0.795 0.426902    
## monthOct              -2.924e+01  2.730e+01  -1.071 0.284492    
## monthSep              -4.112e+00  2.326e+01  -0.177 0.859707    
## hourly_max_t           1.226e-01  4.964e-02   2.469 0.013744 *  
## monthAug:hourly_max_t  2.603e-02  1.339e-01   0.194 0.845904    
## monthDec:hourly_max_t  4.979e-01  1.099e-01   4.531 6.75e-06 ***
## monthFeb:hourly_max_t -6.664e-03  6.404e-02  -0.104 0.917149    
## monthJan:hourly_max_t  3.105e-01  6.912e-02   4.493 8.05e-06 ***
## monthJul:hourly_max_t -3.802e-02  9.375e-02  -0.406 0.685142    
## monthJun:hourly_max_t  1.533e-01  9.689e-02   1.582 0.113938    
## monthMar:hourly_max_t -3.196e-02  6.076e-02  -0.526 0.598998    
## monthMay:hourly_max_t  5.950e-02  7.247e-02   0.821 0.411856    
## monthNov:hourly_max_t  5.964e-02  7.458e-02   0.800 0.424128    
## monthOct:hourly_max_t  1.012e-01  9.313e-02   1.087 0.277522    
## monthSep:hourly_max_t  1.512e-02  7.789e-02   0.194 0.846119    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.04 on 814 degrees of freedom
## Multiple R-squared:  0.5499, Adjusted R-squared:  0.5333 
## F-statistic: 33.15 on 30 and 814 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_10)

## 
##  Breusch-Godfrey test for serial correlation of order up to 34
## 
## data:  Residuals
## LM test = 66.297, df = 34, p-value = 0.0007542
plot(lm_hour_10)

pacf(hour_10_data$production)

Model Summary:

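  • Lag 1 Production: Significant with a positive coefficient. Because the data are filtered to hour 10 before lagging, this is the hour 10 production of the previous observation (the previous day when no days are missing), not the production one hour earlier.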

  • Csnow Surface: Highly significant and negatively impacting, indicating reduced production during snowy conditions.

  • DLWRF Surface: Highly significant and negatively impacting, indicating reduced production with higher downward longwave radiation.

  • Hourly Cloud Average: Highly significant and negatively impacting, suggesting lower production with increased cloud cover.

  • Is Weekend: Not significant (p = 0.375), indicating no detectable weekend effect.

  • Is Religious Day: Marginally significant (p = 0.049) with a positive coefficient, suggesting slightly higher production during religious holidays.

  • Is National Day: Not significant (p = 0.116), indicating no detectable effect of national holidays.

  • Monthly Effects: Some months showed significant interactions with hourly_max_t, indicating seasonal variations in production.
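With a `month * hourly_max_t` term, the temperature slope in a given month is the main effect of hourly_max_t plus that month's interaction coefficient. The sketch below works this out with a few coefficients copied from the hour 10 output. The helper name `temp_slope` is hypothetical, and treating April as the reference month is an assumption based on its absence from the coefficient table.

```r
# Coefficients copied from the hour 10 summary output
coefs <- c(
  "hourly_max_t"          = 1.226e-01,
  "monthDec:hourly_max_t" = 4.979e-01,
  "monthJan:hourly_max_t" = 3.105e-01,
  "monthJul:hourly_max_t" = -3.802e-02
)

# Slope of production with respect to hourly_max_t in a given month
temp_slope <- function(coefs, month) {
  base <- coefs[["hourly_max_t"]]
  term <- paste0("month", month, ":hourly_max_t")
  # Months without an interaction term (the reference level) keep the base slope
  if (term %in% names(coefs)) base + coefs[[term]] else base
}

temp_slope(coefs, "Dec")  # 0.1226 + 0.4979
temp_slope(coefs, "Jan")  # 0.1226 + 0.3105
temp_slope(coefs, "Apr")  # reference month: base slope only
```

Read this way, the significant December and January interactions mean that production is more sensitive to hourly_max_t in those months than in the reference month.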

Residuals and Diagnostics:

  • Residual Standard Error: 2.04 on 814 degrees of freedom; it measures the typical size of the residuals (prediction errors).

  • Multiple R-squared: 0.5499, suggesting that the model explains about 54.99% of the variability in production.

  • Adjusted R-squared: 0.5333, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.
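The gap between the two R-squared figures follows from the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below applies it to the values printed in the hour 10 summary (R² = 0.5499, 30 predictors, 814 residual degrees of freedom, hence n = 845). The function name `adjusted_r2` is illustrative.

```r
# Adjusted R-squared from R-squared, sample size n and number of predictors p
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Values from the printed hour 10 summary: 814 residual df + 30 predictors + 1 intercept
n_obs <- 814 + 30 + 1
adj <- adjusted_r2(0.5499, n = n_obs, p = 30)
round(adj, 4)
```

The result agrees with the Adjusted R-squared printed by summary() to four decimal places, which is a quick way to check that the numbers quoted in the text belong to the model shown.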

The WMAPE for hour 10 was found to be 17.13%, indicating a reasonably accurate model.

Visualization and Interpretation:

  • Hourly Production Data for Hour 10: The data shows significant fluctuations, indicating variability in production.

  • Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

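    • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, consistent with the Breusch-Godfrey test (LM = 66.297, p = 0.0007542), which rejects the null hypothesis of no serial correlation. Not all patterns in the data are fully captured by the model.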

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 11 Analysis

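# Production data for hour 11 (must exist before it is plotted)
hour_11_data <- all_data[all_data$hour == 11, ]

# Plot production for hour 11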
ggplot(hour_11_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 11",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 11, then create trend and lag variables
hour_11_data <- all_data[all_data$hour == 11, ]
hour_11_data <- hour_11_data[,-c(2)]
hour_11_data$trend_hour_11 <- 1:nrow(hour_11_data)

hour_11_data[, lag_19_production := shift(production,19)]
hour_11_data[,lag_19_diff:=production-lag_19_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_11_data <- hour_11_data[!is.na(lag_19_production)]
# Fit linear regression model for hour 11
lm_hour_11 <- lm(production ~+lag_19_production+trend_hour_11+special_period+CSNOW_surface+DLWRF_surface+TMP_surface+hourly_cloud_average+is.weekend+is.religousday+is.nationalday+month*hourly_max_t, data = hour_11_data)
summary(lm_hour_11)
## 
## Call:
## lm(formula = production ~ +lag_19_production + trend_hour_11 + 
##     special_period + CSNOW_surface + DLWRF_surface + TMP_surface + 
##     hourly_cloud_average + is.weekend + is.religousday + is.nationalday + 
##     month * hourly_max_t, data = hour_11_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8293 -0.6438  0.2980  1.0601  6.7454 
## 
## Coefficients: (1 not defined because of singularities)
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -2.648e+01  1.281e+01  -2.068 0.038998 *  
## lag_19_production     -6.935e-02  2.616e-02  -2.651 0.008194 ** 
## trend_hour_11         -7.087e-04  4.135e-04  -1.714 0.086916 .  
## special_period         3.271e-01  2.628e-01   1.245 0.213632    
## CSNOW_surface         -1.956e+00  5.801e-01  -3.373 0.000781 ***
## DLWRF_surface         -3.168e-02  4.589e-03  -6.903 1.04e-11 ***
## TMP_surface            1.518e-01  4.405e-02   3.446 0.000599 ***
## hourly_cloud_average  -1.728e-02  5.404e-03  -3.197 0.001444 ** 
## is.weekend            -1.295e-01  1.572e-01  -0.823 0.410494    
## is.religousday         1.882e-01  4.448e-01   0.423 0.672330    
## is.nationalday         3.911e-01  5.597e-01   0.699 0.484986    
## monthAug              -4.955e+01  4.017e+01  -1.233 0.217756    
## monthDec              -1.214e+02  2.656e+01  -4.570 5.66e-06 ***
## monthFeb               2.564e+01  1.699e+01   1.509 0.131646    
## monthJan              -8.820e+01  1.909e+01  -4.620 4.48e-06 ***
## monthJul              -8.063e+00  2.623e+01  -0.307 0.758577    
## monthJun              -1.023e+01  2.567e+01  -0.398 0.690463    
## monthMar               1.661e+01  1.596e+01   1.041 0.298382    
## monthMay              -3.758e+00  1.952e+01  -0.192 0.847411    
## monthNov              -3.104e+01  1.901e+01  -1.633 0.102942    
## monthOct              -1.058e+01  2.549e+01  -0.415 0.678376    
## monthSep              -1.605e+01  2.146e+01  -0.748 0.454965    
## hourly_max_t                  NA         NA      NA       NA    
## monthAug:hourly_max_t  1.578e-01  1.285e-01   1.228 0.219990    
## monthDec:hourly_max_t  4.319e-01  9.323e-02   4.633 4.22e-06 ***
## monthFeb:hourly_max_t -8.750e-02  5.924e-02  -1.477 0.140078    
## monthJan:hourly_max_t  3.162e-01  6.705e-02   4.715 2.85e-06 ***
## monthJul:hourly_max_t  2.752e-02  8.608e-02   0.320 0.749326    
## monthJun:hourly_max_t  3.347e-02  8.539e-02   0.392 0.695214    
## monthMar:hourly_max_t -5.515e-02  5.503e-02  -1.002 0.316592    
## monthMay:hourly_max_t  1.027e-02  6.580e-02   0.156 0.876002    
## monthNov:hourly_max_t  1.084e-01  6.541e-02   1.657 0.097897 .  
## monthOct:hourly_max_t  3.737e-02  8.590e-02   0.435 0.663686    
## monthSep:hourly_max_t  5.351e-02  7.098e-02   0.754 0.451115    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.03 on 794 degrees of freedom
## Multiple R-squared:  0.5389, Adjusted R-squared:  0.5204 
## F-statistic:    29 on 32 and 794 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_11)

## 
##  Breusch-Godfrey test for serial correlation of order up to 37
## 
## data:  Residuals
## LM test = 60.556, df = 37, p-value = 0.008602
plot(lm_hour_11)

Model Summary:

  • Lag 19 Production: Significant with a negative coefficient. Because the data are filtered to hour 11 before lagging, lag_19_production is the hour 11 production 19 observations earlier (19 days when no days are missing), not 19 hours earlier.

  • Trend Hour 11: Not significant at the 5% level (p = 0.087), suggesting no clear upward or downward trend in production over time for this hour.

  • Special Period: Not significant, indicating minimal impact of special periods on production during hour 11.

  • Csnow Surface: Highly significant and negatively impacting, indicating reduced production during snowy conditions.

  • DLWRF Surface: Highly significant and negatively impacting, indicating reduced production with higher downward longwave radiation.

  • TMP Surface: Significant with a positive impact, suggesting higher production with increased surface temperature.

  • Hourly Cloud Average: Significant and negatively impacting, indicating lower production with increased cloud cover.

  • Is Weekend: Not significant, indicating minimal impact during weekends.

  • Is Religious Day: Not significant, indicating minimal impact during religious holidays.

  • Is National Day: Not significant, indicating minimal impact during national holidays.

  • Monthly Effects: The December and January interactions with hourly_max_t are significant, indicating seasonal variation in the temperature effect. The main effect of hourly_max_t is reported as NA because of a singularity: it is linearly dependent on other terms in the model, so R could not estimate it separately.

Residuals and Diagnostics:

  • Residual Standard Error: 2.03 on 794 degrees of freedom; it measures the typical size of the residuals (prediction errors).

  • Multiple R-squared: 0.5389, suggesting that the model explains about 53.89% of the variability in production.

  • Adjusted R-squared: 0.5204, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.

Visualization and Interpretation:

  • Hourly Production Data for Hour 11: The data shows significant fluctuations, indicating variability in production.

  • Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

    • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, consistent with the Breusch-Godfrey test (LM = 60.556, p = 0.008602), which rejects the null hypothesis of no serial correlation. Not all patterns in the data are fully captured by the model.

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 12 Analysis

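# Production data for hour 12 (must exist before it is plotted)
hour_12_data <- all_data[all_data$hour == 12, ]

# Plot production for hour 12
ggplot(hour_12_data, aes(x = date, y = production)) +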
  geom_line() +
  labs(title = "Hourly Production Data for Hour 12",
       x = "Date",
       y = "Production") +
  theme_minimal()


# Filter data for hour 12, then create lag and trend variables
hour_12_data <- all_data[all_data$hour == 12, ]
hour_12_data <- hour_12_data[,-c(2)]
hour_12_data[, lag_1_production := shift(production,1)]
hour_12_data[,lag_1_diff:=production-lag_1_production]
hour_12_data <- hour_12_data[!is.na(lag_1_production)]
hour_12_data$trend_hour_12 <- 1:nrow(hour_12_data)
lm_hour_12 <- lm(production ~ +lag_1_production+trend_hour_12+CSNOW_surface+DLWRF_surface+TMP_surface+hourly_cloud_average+is.weekend+is.religousday+is.nationalday+month*hourly_max_t , data = hour_12_data)
summary(lm_hour_12)
## 
## Call:
## lm(formula = production ~ +lag_1_production + trend_hour_12 + 
##     CSNOW_surface + DLWRF_surface + TMP_surface + hourly_cloud_average + 
##     is.weekend + is.religousday + is.nationalday + month * hourly_max_t, 
##     data = hour_12_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.4179 -0.7232  0.2480  1.1310  6.5360 
## 
## Coefficients: (1 not defined because of singularities)
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -2.044e+01  1.229e+01  -1.663 0.096784 .  
## lag_1_production       9.585e-02  2.853e-02   3.360 0.000816 ***
## trend_hour_12         -6.164e-04  3.318e-04  -1.858 0.063583 .  
## CSNOW_surface         -1.382e+00  5.373e-01  -2.572 0.010293 *  
## DLWRF_surface         -3.028e-02  4.373e-03  -6.925 8.87e-12 ***
## TMP_surface            1.248e-01  4.206e-02   2.967 0.003100 ** 
## hourly_cloud_average  -1.783e-02  5.322e-03  -3.349 0.000847 ***
## is.weekend            -2.725e-01  1.539e-01  -1.771 0.076955 .  
## is.religousday         1.557e-01  4.376e-01   0.356 0.722122    
## is.nationalday         4.066e-01  5.543e-01   0.734 0.463386    
## monthAug              -2.198e+01  4.001e+01  -0.549 0.582889    
## monthDec              -6.447e+01  2.382e+01  -2.707 0.006934 ** 
## monthFeb               2.736e+01  1.575e+01   1.737 0.082741 .  
## monthJan              -6.111e+01  1.709e+01  -3.575 0.000371 ***
## monthJul               1.667e+01  2.441e+01   0.683 0.494851    
## monthJun              -5.513e+00  2.362e+01  -0.233 0.815540    
## monthMar               9.252e+00  1.488e+01   0.622 0.534191    
## monthMay               1.227e+01  1.842e+01   0.666 0.505469    
## monthNov              -2.384e+01  1.779e+01  -1.340 0.180513    
## monthOct              -1.661e+01  2.434e+01  -0.682 0.495177    
## monthSep              -1.128e+01  2.041e+01  -0.553 0.580642    
## hourly_max_t                  NA         NA      NA       NA    
## monthAug:hourly_max_t  7.162e-02  1.269e-01   0.564 0.572701    
## monthDec:hourly_max_t  2.275e-01  8.293e-02   2.743 0.006220 ** 
## monthFeb:hourly_max_t -9.554e-02  5.447e-02  -1.754 0.079774 .  
## monthJan:hourly_max_t  2.164e-01  5.950e-02   3.637 0.000293 ***
## monthJul:hourly_max_t -5.148e-02  7.957e-02  -0.647 0.517835    
## monthJun:hourly_max_t  1.842e-02  7.820e-02   0.236 0.813818    
## monthMar:hourly_max_t -2.977e-02  5.089e-02  -0.585 0.558761    
## monthMay:hourly_max_t -4.109e-02  6.177e-02  -0.665 0.506079    
## monthNov:hourly_max_t  8.153e-02  6.071e-02   1.343 0.179644    
## monthOct:hourly_max_t  5.646e-02  8.141e-02   0.694 0.488148    
## monthSep:hourly_max_t  3.811e-02  6.697e-02   0.569 0.569426    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.005 on 813 degrees of freedom
## Multiple R-squared:  0.5433, Adjusted R-squared:  0.5258 
## F-statistic: 31.19 on 31 and 813 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_12)

## 
##  Breusch-Godfrey test for serial correlation of order up to 36
## 
## data:  Residuals
## LM test = 37.429, df = 36, p-value = 0.4034
plot(lm_hour_12)

Model Summary:

  • Lag 1 Production: Significant with a positive coefficient. Because the data are filtered to hour 12 before lagging, this is the hour 12 production of the previous observation (the previous day when no days are missing), not the production one hour earlier.

  • Trend Hour 12: Negative but only marginally significant (p = 0.064), giving weak evidence of a slight downward trend in production over time for this hour.

  • Csnow Surface: Significant with a negative impact, indicating reduced production during snowy conditions.

  • DLWRF Surface: Highly significant and negatively impacting, indicating reduced production with higher downward longwave radiation.

  • TMP Surface: Significant with a positive coefficient (p = 0.0031), suggesting higher production with increased surface temperature.

  • Hourly Cloud Average: Significant and negatively impacting, indicating lower production with increased cloud cover.

  • Is Weekend: Negative but only marginally significant (p = 0.077), giving weak evidence of reduced production during weekends.

  • Is Religious Day: Not significant, indicating minimal impact during religious holidays.

  • Is National Day: Not significant, indicating minimal impact during national holidays.

  • Monthly Effects: Some months showed significant interactions with hourly_max_t, indicating seasonal variations in production.
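The hour 12 output reports "1 not defined because of singularities" and an NA coefficient for hourly_max_t. R does this when a term is an exact linear combination of other terms. The sketch below reproduces the behaviour on simulated data (the variables `x1`, `x2` and `y` are made up) and uses `alias()` from base R to show which term is affected. It does not identify the cause in the production data.

```r
# Simulated data in which x2 is an exact multiple of x1
set.seed(1)
x1 <- rnorm(50)
x2 <- 2 * x1
y  <- 1 + 3 * x1 + rnorm(50)

fit <- lm(y ~ x1 + x2)
coef(fit)             # the coefficient of x2 is NA

# alias() lists the terms that are linear combinations of earlier terms
al <- alias(fit)
al$Complete
```

Running `alias(lm_hour_12)` on the fitted model should show in the same way which terms hourly_max_t depends on.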

Residuals and Diagnostics:

  • Residual Standard Error: Indicates the variability in the residuals or prediction errors.

  • Multiple R-squared: 0.5433, suggesting that the model explains about 54.33% of the variability in production.

  • Adjusted R-squared: 0.5258, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.

Visualization and Interpretation:

  • Hourly Production Data for Hour 12: The data shows significant fluctuations, indicating variability in production.

  • Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

    • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation at individual lags, although the Breusch-Godfrey test (LM = 37.429, p = 0.4034) does not reject the null hypothesis of no serial correlation up to order 36.

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 13 Analysis

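# Production data for hour 13 (must exist before it is plotted)
hour_13_data <- all_data[all_data$hour == 13, ]

# Plot production for hour 13
ggplot(hour_13_data, aes(x = date, y = production)) +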
  geom_line() +
  labs(title = "Hourly Production Data for Hour 13",
       x = "Date",
       y = "Production") +
  theme_minimal()

# Filter data for hour 13
hour_13_data <- all_data[all_data$hour == 13, ]
hour_13_data <- hour_13_data[,-c(2)]

hour_13_data[, lag_1_production := shift(production,1)]
hour_13_data[,lag_1_diff:=production-lag_1_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_13_data <- hour_13_data[!is.na(lag_1_production)]
# Create a trend variable for hour 13
hour_13_data$trend_hour_13 <- 1:nrow(hour_13_data)
lm_hour_13 <- lm(production ~+lag_1_production +trend_hour_13+DLWRF_surface+TMP_surface+hourly_cloud_average+is.weekend+is.nationalday+is.publicholiday+month * hourly_max_t , data = hour_13_data)
summary(lm_hour_13)
## 
## Call:
## lm(formula = production ~ +lag_1_production + trend_hour_13 + 
##     DLWRF_surface + TMP_surface + hourly_cloud_average + is.weekend + 
##     is.nationalday + is.publicholiday + month * hourly_max_t, 
##     data = hour_13_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.7359 -1.0069  0.3289  1.2389  6.2663 
## 
## Coefficients: (1 not defined because of singularities)
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -3.231e+01  1.287e+01  -2.512 0.012212 *  
## lag_1_production       9.720e-02  2.816e-02   3.451 0.000587 ***
## trend_hour_13         -6.402e-04  3.405e-04  -1.880 0.060486 .  
## DLWRF_surface         -3.333e-02  4.420e-03  -7.539 1.26e-13 ***
## TMP_surface            1.659e-01  4.380e-02   3.789 0.000163 ***
## hourly_cloud_average  -2.490e-02  5.626e-03  -4.426 1.09e-05 ***
## is.weekend             3.515e+00  1.153e+00   3.049 0.002370 ** 
## is.nationalday         1.938e+00  7.691e-01   2.520 0.011932 *  
## is.publicholiday      -3.789e+00  1.164e+00  -3.255 0.001180 ** 
## monthAug              -5.345e+01  4.184e+01  -1.278 0.201715    
## monthDec              -3.358e+01  2.374e+01  -1.415 0.157499    
## monthFeb               2.762e+01  1.617e+01   1.709 0.087897 .  
## monthJan              -4.076e+01  1.753e+01  -2.325 0.020327 *  
## monthJul               2.489e+01  2.358e+01   1.055 0.291520    
## monthJun               8.255e+00  2.403e+01   0.344 0.731272    
## monthMar               1.101e+01  1.522e+01   0.724 0.469515    
## monthMay               2.711e+01  1.873e+01   1.448 0.148040    
## monthNov              -1.899e+01  1.855e+01  -1.024 0.306284    
## monthOct               2.044e+01  2.516e+01   0.813 0.416723    
## monthSep               9.972e+00  2.138e+01   0.466 0.641036    
## hourly_max_t                  NA         NA      NA       NA    
## monthAug:hourly_max_t  1.688e-01  1.323e-01   1.276 0.202312    
## monthDec:hourly_max_t  1.187e-01  8.230e-02   1.443 0.149513    
## monthFeb:hourly_max_t -9.251e-02  5.564e-02  -1.663 0.096762 .  
## monthJan:hourly_max_t  1.461e-01  6.073e-02   2.405 0.016393 *  
## monthJul:hourly_max_t -7.784e-02  7.678e-02  -1.014 0.310941    
## monthJun:hourly_max_t -2.549e-02  7.948e-02  -0.321 0.748495    
## monthMar:hourly_max_t -3.392e-02  5.183e-02  -0.655 0.512972    
## monthMay:hourly_max_t -8.854e-02  6.274e-02  -1.411 0.158518    
## monthNov:hourly_max_t  6.438e-02  6.312e-02   1.020 0.308030    
## monthOct:hourly_max_t -6.743e-02  8.395e-02  -0.803 0.422092    
## monthSep:hourly_max_t -3.043e-02  6.992e-02  -0.435 0.663570    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.096 on 814 degrees of freedom
## Multiple R-squared:  0.5608, Adjusted R-squared:  0.5447 
## F-statistic: 34.65 on 30 and 814 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_13)

## 
##  Breusch-Godfrey test for serial correlation of order up to 35
## 
## data:  Residuals
## LM test = 57.215, df = 35, p-value = 0.0103
plot(lm_hour_13)

Model Summary:

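  • Lag 1 Production: Significant with a positive coefficient. Because the data are filtered to hour 13 before lagging, this is the hour 13 production of the previous observation (the previous day when no days are missing), not the production one hour earlier.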

  • Trend Hour 13: Marginally significant with a negative impact, suggesting a slight downward trend in production over time for this hour.

  • DLWRF Surface: Highly significant and negatively impacting, indicating reduced production with higher downward longwave radiation.

  • TMP Surface: Significant with a positive impact, suggesting higher production with increased surface temperature.

  • Hourly Cloud Average: Significant and negatively impacting, indicating lower production with increased cloud cover.

  • Is Weekend: Significant with a positive impact, indicating increased production during weekends.

  • Is National Day: Significant with a positive impact, indicating higher production during national holidays.

  • Is Public Holiday: Significant with a negative impact, indicating reduced production during public holidays.

  • Monthly Effects: Some months showed significant interactions with hourly_max_t, indicating seasonal variations in production.
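Labels such as "significant" and "marginally significant" are easy to misapply when read off a long coefficient table by eye. The sketch below shows one way to derive them from the p-values directly, using the same cut-offs as the significance codes in the summary output. The function name `significance_table`, the label wording and the simulated data are assumptions.

```r
# Tabulate coefficients with a significance label based on the usual cut-offs
significance_table <- function(model) {
  co <- summary(model)$coefficients
  p  <- co[, "Pr(>|t|)"]
  data.frame(
    term     = rownames(co),
    estimate = co[, "Estimate"],
    p_value  = p,
    label    = cut(p,
                   breaks = c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
                   labels = c("p <= 0.001", "p <= 0.01", "p <= 0.05",
                              "p <= 0.1", "not significant")),
    row.names = NULL
  )
}

# Simulated example: x has a strong effect on y, z has none by construction
set.seed(1)
d <- data.frame(x = rnorm(200), z = rnorm(200))
d$y <- 5 * d$x + rnorm(200)

tab <- significance_table(lm(y ~ x + z, data = d))
print(tab)
```

Calling `significance_table(lm_hour_13)` would produce the same kind of table for the model discussed here.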

Residuals and Diagnostics:

  • Residual Standard Error: 2.096 on 814 degrees of freedom; it measures the typical size of the residuals (prediction errors).

  • Multiple R-squared: 0.5608, suggesting that the model explains about 56.08% of the variability in production.

  • Adjusted R-squared: 0.5447, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  • F-statistic: Significant, indicating that the model provides a better fit than a model with no predictors.

The Weighted Mean Absolute Percentage Error (WMAPE) for hour 13 was 23.93%, a higher error than for hour 9 (19.51%) and hour 10 (17.13%) but still reasonably accurate.

Visualization and Interpretation:

  • Hourly Production Data for Hour 13: The data shows significant fluctuations, indicating variability in production.

  • Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

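    • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, consistent with the Breusch-Godfrey test (LM = 57.215, p = 0.0103), which rejects the null hypothesis of no serial correlation at the 5% level. Not all patterns in the data are fully captured by the model.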

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 14 Analysis

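# Production data for hour 14 (must exist before it is plotted)
hour_14_data <- all_data[all_data$hour == 14, ]

# Plot production for hour 14
ggplot(hour_14_data, aes(x = date, y = production)) +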
  geom_line() +
  labs(title = "Hourly Production Data for Hour 14",
       x = "Date",
       y = "Production") +
  theme_minimal()

hour_14_data <- all_data[all_data$hour == 14, ]
hour_14_data <- hour_14_data[,-c(2)]
hour_14_data$trend_hour_14 <- 1:nrow(hour_14_data)
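# Note: despite its name, lag_14_production is a one-row lag (shift by 1),
# i.e. the previous observation at hour 14
hour_14_data[, lag_14_production := shift(production,1)]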
hour_14_data[,lag_14_diff:=production-lag_14_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_14_data <- hour_14_data[!is.na(lag_14_production)]
lm_hour_14 <- lm(production ~ +lag_14_production+ special_period+DLWRF_surface+TMP_surface+is.weekend+is.ramadan+is.religousday+is.nationalday+is.publicholiday+hourly_cloud_average+month*hourly_max_t , data = hour_14_data)
summary(lm_hour_14)
## 
## Call:
## lm(formula = production ~ +lag_14_production + special_period + 
##     DLWRF_surface + TMP_surface + is.weekend + is.ramadan + is.religousday + 
##     is.nationalday + is.publicholiday + hourly_cloud_average + 
##     month * hourly_max_t, data = hour_14_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.488 -1.097  0.228  1.213  6.766 
## 
## Coefficients: (1 not defined because of singularities)
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -1.732e+01  1.263e+01  -1.371  0.17076    
## lag_14_production      1.175e-01  2.881e-02   4.078 4.99e-05 ***
## special_period         3.980e-01  2.180e-01   1.826  0.06827 .  
## DLWRF_surface         -2.542e-02  4.290e-03  -5.925 4.62e-09 ***
## TMP_surface            1.055e-01  4.318e-02   2.443  0.01478 *  
## is.weekend             3.513e+00  1.102e+00   3.187  0.00149 ** 
## is.ramadan             2.437e-01  3.645e-01   0.669  0.50395    
## is.religousday         9.014e-03  4.318e-01   0.021  0.98335    
## is.nationalday         1.739e+00  7.352e-01   2.366  0.01824 *  
## is.publicholiday      -3.673e+00  1.113e+00  -3.300  0.00101 ** 
## hourly_cloud_average  -3.190e-02  5.417e-03  -5.888 5.70e-09 ***
## monthAug              -3.026e+01  3.544e+01  -0.854  0.39350    
## monthDec              -1.221e+01  2.328e+01  -0.524  0.60017    
## monthFeb               1.662e+01  1.543e+01   1.077  0.28182    
## monthJan              -3.522e+01  1.672e+01  -2.107  0.03543 *  
## monthJul               2.397e+01  2.228e+01   1.076  0.28232    
## monthJun              -3.196e+01  2.370e+01  -1.348  0.17788    
## monthMar               1.270e+01  1.464e+01   0.867  0.38602    
## monthMay               2.647e+01  1.855e+01   1.427  0.15400    
## monthNov              -6.076e+00  1.820e+01  -0.334  0.73856    
## monthOct               2.077e+01  2.454e+01   0.846  0.39762    
## monthSep               2.979e-01  2.111e+01   0.014  0.98874    
## hourly_max_t                  NA         NA      NA       NA    
## monthAug:hourly_max_t  9.717e-02  1.126e-01   0.863  0.38857    
## monthDec:hourly_max_t  3.702e-02  8.086e-02   0.458  0.64717    
## monthFeb:hourly_max_t -5.687e-02  5.313e-02  -1.070  0.28475    
## monthJan:hourly_max_t  1.226e-01  5.795e-02   2.116  0.03467 *  
## monthJul:hourly_max_t -7.447e-02  7.275e-02  -1.024  0.30635    
## monthJun:hourly_max_t  1.060e-01  7.868e-02   1.347  0.17844    
## monthMar:hourly_max_t -4.131e-02  4.983e-02  -0.829  0.40732    
## monthMay:hourly_max_t -8.723e-02  6.239e-02  -1.398  0.16244    
## monthNov:hourly_max_t  1.655e-02  6.213e-02   0.266  0.79002    
## monthOct:hourly_max_t -7.260e-02  8.215e-02  -0.884  0.37713    
## monthSep:hourly_max_t -5.273e-04  6.924e-02  -0.008  0.99393    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.007 on 812 degrees of freedom
## Multiple R-squared:  0.5797, Adjusted R-squared:  0.5632 
## F-statistic: 35.01 on 32 and 812 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_14)

## 
##  Breusch-Godfrey test for serial correlation of order up to 37
## 
## data:  Residuals
## LM test = 32.917, df = 37, p-value = 0.6609
plot(lm_hour_14)

Model Summary:

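  1. Lag 14 Production: Significant with a positive impact, indicating that the previous day's production at hour 14 has a noticeable influence on the current day's value. (The lag is taken within the hour-14 subset, so it spans one day, not one hour.)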

  2. Special Period: Marginally significant with a positive impact, suggesting that special periods might slightly increase production.

  3. DLWRF Surface: Highly significant and negatively impacting, indicating reduced production with higher downward longwave radiation.

  4. TMP Surface: Significant with a positive impact, suggesting higher production with increased surface temperature.

  5. Is Weekend: Significant with a positive impact, indicating increased production during weekends.

  6. Is National Day: Significant with a positive impact, indicating higher production during national holidays.

  7. Is Public Holiday: Significant with a negative impact, indicating reduced production during public holidays.

  8. Hourly Cloud Average: Significant and negatively impacting, indicating lower production with increased cloud cover.

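  9. Monthly Effects: Only January (both its main effect and its interaction with hourly_max_t) is significant at the 5% level, so the evidence for seasonal variation in the temperature effect is limited. The hourly_max_t main effect could not be estimated because of a singularity.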
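The `NA` row for hourly_max_t in the summary output above arises when a column of the design matrix is an exact linear combination of earlier columns, in which case lm() drops it. In this section the main effect is `NA` in the hour 14, 15 and 18 models, which all include TMP_surface, and is estimated in the hour 16 and 17 models, which do not. That pattern is consistent with the two variables being collinear, although the output alone does not prove it. A minimal sketch on made-up data (the names `tmp` and `max_t` and all values are illustrative assumptions) shows how such a coefficient can be detected:

```r
# Toy data: max_t duplicates tmp exactly, so its main effect cannot be estimated.
set.seed(1)
n     <- 60
month <- factor(rep(c("Apr", "Aug", "Dec"), each = n / 3))
tmp   <- rnorm(n, mean = 290, sd = 5)
max_t <- tmp                       # assumption: perfectly collinear with tmp
y     <- 0.1 * tmp + rnorm(n)

fit <- lm(y ~ tmp + month * max_t)

# Coefficients that lm() could not estimate are reported as NA.
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# alias() reports which earlier columns reproduce the dropped one.
print(alias(fit)$Complete)
```

Running the same two checks on lm_hour_14 would show which predictors reproduce hourly_max_t in the real data.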

Residuals and Diagnostics:

  1. Residual Standard Error: 2.007 on 812 degrees of freedom, which measures the typical size of the residuals (prediction errors).

  2. Multiple R-squared: 0.5797, suggesting that the model explains about 57.97% of the variability in production.

  3. Adjusted R-squared: 0.5632, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  4. F-statistic: 35.01 on 32 and 812 DF (p-value < 2.2e-16), indicating that the model provides a better fit than a model with no predictors.
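Adjusted R-squared can be recomputed from the Multiple R-squared and the degrees of freedom printed by summary(), which gives a quick check that quoted values agree with the output. A short sketch using the hour 14 figures from the output above (the helper name `adj_r2` is our own):

```r
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (residual degrees of freedom)
adj_r2 <- function(r2, n, df_resid) {
  1 - (1 - r2) * (n - 1) / df_resid
}

# Hour 14 output: R^2 = 0.5797, F-statistic on 32 and 812 DF.
df_model <- 32
df_resid <- 812
n        <- df_model + df_resid + 1   # 845 observations (intercept included)

adj_hour_14 <- adj_r2(0.5797, n, df_resid)
print(round(adj_hour_14, 3))          # 0.563
```

The small difference from the printed 0.5632 comes from using the rounded R-squared as input.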

The Weighted Mean Absolute Percentage Error (WMAPE) for hour 14 is 25.24%, which we consider a reasonably good result.
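The report does not show how WMAPE is computed. A common definition, assumed here, divides the sum of absolute errors by the sum of absolute actual values. The function name `wmape` and the numbers below are illustrative only:

```r
# WMAPE under the assumed definition: sum(|actual - forecast|) / sum(|actual|)
wmape <- function(actual, forecast) {
  stopifnot(length(actual) == length(forecast))
  sum(abs(actual - forecast)) / sum(abs(actual))
}

actual   <- c(10, 20, 30)
forecast <- c(12, 18, 33)

# Absolute errors are 2, 2 and 3, so WMAPE = 7 / 60
print(100 * wmape(actual, forecast))   # about 11.67 (percent)
```

For an in-sample figure one could call `wmape(hour_14_data$production, fitted(lm_hour_14))`, assuming no rows were dropped for missing predictors during fitting.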

Visualization and Interpretation:

  1. Hourly Production Data for Hour 14: The data shows significant fluctuations, indicating variability in production.

  2. Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

    • ACF Plot (Autocorrelation of Residuals): The Breusch-Godfrey test (LM = 32.917, df = 37, p-value = 0.6609) finds no significant serial correlation, so any autocorrelation visible in the plot is weak.

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 15 Analysis

ggplot(hour_15_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 15",
       x = "Date",
       y = "Production") +
  theme_minimal()

hour_15_data <- all_data[all_data$hour == 15, ]
hour_15_data <- hour_15_data[,-c(2)]
hour_15_data$trend_hour_15 <- 1:nrow(hour_15_data)
hour_15_data[, lag_15_production := shift(production,1)]
hour_15_data[,lag_15_diff:=production-lag_15_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_15_data <- hour_15_data[!is.na(lag_15_production)]
lm_hour_15 <- lm(production ~+lag_15_production+trend_hour_15+special_period+DSWRF_surface+USWRF_top_of_atmosphere+DLWRF_surface+hourly_cloud_average+TMP_surface+is.weekend+is.ramadan+is.publicholiday +month*hourly_max_t , data = hour_15_data)
summary(lm_hour_15)
## 
## Call:
## lm(formula = production ~ +lag_15_production + trend_hour_15 + 
##     special_period + DSWRF_surface + USWRF_top_of_atmosphere + 
##     DLWRF_surface + hourly_cloud_average + TMP_surface + is.weekend + 
##     is.ramadan + is.publicholiday + month * hourly_max_t, data = hour_15_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.2078 -0.8555  0.0800  0.9873  6.1980 
## 
## Coefficients: (1 not defined because of singularities)
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -4.692e+01  1.267e+01  -3.705 0.000226 ***
## lag_15_production        1.144e-01  2.990e-02   3.827 0.000140 ***
## trend_hour_15           -7.582e-04  3.227e-04  -2.349 0.019046 *  
## special_period           4.856e-01  2.105e-01   2.307 0.021293 *  
## DSWRF_surface            3.095e-03  1.333e-03   2.323 0.020444 *  
## USWRF_top_of_atmosphere  1.062e-02  2.017e-03   5.268 1.77e-07 ***
## DLWRF_surface           -2.499e-02  5.065e-03  -4.933 9.81e-07 ***
## hourly_cloud_average    -2.648e-02  4.689e-03  -5.647 2.26e-08 ***
## TMP_surface              1.857e-01  4.515e-02   4.113 4.31e-05 ***
## is.weekend               1.134e+00  6.891e-01   1.645 0.100321    
## is.ramadan               3.829e-01  2.979e-01   1.285 0.199111    
## is.publicholiday        -1.400e+00  6.843e-01  -2.045 0.041138 *  
## monthAug                 1.587e+01  2.506e+01   0.633 0.526879    
## monthDec                 2.097e+01  2.065e+01   1.016 0.310038    
## monthFeb                -2.971e+00  1.339e+01  -0.222 0.824463    
## monthJan                -2.962e+01  1.485e+01  -1.994 0.046524 *  
## monthJul                 9.355e+00  1.868e+01   0.501 0.616687    
## monthJun                -4.056e+01  1.898e+01  -2.137 0.032911 *  
## monthMar                 1.168e+01  1.288e+01   0.907 0.364749    
## monthMay                -1.450e+00  1.639e+01  -0.088 0.929560    
## monthNov                 3.102e+01  1.598e+01   1.942 0.052522 .  
## monthOct                 2.315e+01  2.085e+01   1.110 0.267157    
## monthSep                 1.328e+00  1.830e+01   0.073 0.942168    
## hourly_max_t                    NA         NA      NA       NA    
## monthAug:hourly_max_t   -5.224e-02  8.057e-02  -0.648 0.516936    
## monthDec:hourly_max_t   -7.230e-02  7.216e-02  -1.002 0.316642    
## monthFeb:hourly_max_t    1.553e-02  4.650e-02   0.334 0.738468    
## monthJan:hourly_max_t    1.108e-01  5.227e-02   2.120 0.034290 *  
## monthJul:hourly_max_t   -3.180e-02  6.132e-02  -0.519 0.604176    
## monthJun:hourly_max_t    1.338e-01  6.344e-02   2.109 0.035283 *  
## monthMar:hourly_max_t   -3.846e-02  4.421e-02  -0.870 0.384621    
## monthMay:hourly_max_t    6.541e-03  5.544e-02   0.118 0.906109    
## monthNov:hourly_max_t   -1.078e-01  5.469e-02  -1.971 0.049057 *  
## monthOct:hourly_max_t   -8.094e-02  7.015e-02  -1.154 0.248912    
## monthSep:hourly_max_t   -6.903e-03  6.036e-02  -0.114 0.908975    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.636 on 811 degrees of freedom
## Multiple R-squared:  0.6816, Adjusted R-squared:  0.6686 
## F-statistic:  52.6 on 33 and 811 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_15)

## 
##  Breusch-Godfrey test for serial correlation of order up to 38
## 
## data:  Residuals
## LM test = 56.797, df = 38, p-value = 0.02551
plot(lm_hour_15)

Model Summary:

  1. Lag 15 Production: Significant with a positive impact, indicating that the previous day's production at hour 15 has a noticeable influence on the current day's value. (The lag is taken within the hour-15 subset, so it spans one day, not 15 hours.)

  2. Trend Hour 15: Significant with a negative impact, suggesting a slight downward trend over time.

  3. Special Period: Significant with a positive impact, indicating that special periods might slightly increase production.

  4. DSWRF Surface: Significant with a positive impact, suggesting increased production with higher downward shortwave radiation.

  5. USWRF Top of Atmosphere: Highly significant with a positive impact, indicating higher production with increased upward shortwave radiation.

  6. DLWRF Surface: Highly significant with a negative impact, indicating reduced production with higher downward longwave radiation.

  7. Hourly Cloud Average: Highly significant with a negative impact, indicating lower production with increased cloud cover.

  8. TMP Surface: Highly significant with a positive impact (p = 4.31e-05), suggesting higher production with increased surface temperature.

  9. Is Weekend: Positive estimate but not significant (p = 0.100), so the model provides no clear evidence of a weekend effect.

  10. Is Ramadan: Positive estimate but not significant (p = 0.199), so the model provides no clear evidence of a Ramadan effect.

  11. Is Public Holiday: Significant with a negative impact, indicating reduced production during public holidays.

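  12. Monthly Effects: The January, June and November interactions with hourly_max_t are significant at the 5% level, indicating seasonal variation in the temperature effect. The hourly_max_t main effect could not be estimated because of a singularity.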
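To keep the wording of such summaries aligned with the coefficient table, p-values can be mapped to the significance codes printed beneath it. The sketch below is a small helper of our own (`signif_label` is a hypothetical name, and "n.s." is our label for the blank code), applied to four p-values copied from the hour 15 output:

```r
# Map p-values to the significance codes printed under the coefficient table.
signif_label <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                   labels = c("***", "**", "*", ".", "n.s."),
                   include.lowest = TRUE))
}

# p-values copied from the hour 15 summary output
p_hour_15 <- c(TMP_surface      = 4.31e-05,
               is.weekend       = 0.100321,
               is.ramadan       = 0.199111,
               is.publicholiday = 0.041138)

labels_hour_15 <- signif_label(p_hour_15)
names(labels_hour_15) <- names(p_hour_15)
print(labels_hour_15)
```

Under this mapping, "highly significant" corresponds to `***`, "significant" to `**` or `*`, and "marginally significant" to `.` only.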

Residuals and Diagnostics:

  1. Residual Standard Error: 1.636 on 811 degrees of freedom, which measures the typical size of the residuals (prediction errors).

  2. Multiple R-squared: 0.6816, suggesting that the model explains about 68.16% of the variability in production.

  3. Adjusted R-squared: 0.6686, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  4. F-statistic: 52.6 on 33 and 811 DF (p-value < 2.2e-16), indicating that the model provides a better fit than a model with no predictors.

The Weighted Mean Absolute Percentage Error (WMAPE) for hour 15 is 31.63%, which we consider acceptable.

Visualization and Interpretation:

  1. Hourly Production Data for Hour 15: The data shows significant fluctuations, indicating variability in production.

  2. Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

    • ACF Plot (Autocorrelation of Residuals): Indicates some autocorrelation, which the Breusch-Godfrey test confirms at the 5% level (LM = 56.797, df = 38, p-value = 0.02551), suggesting that not all patterns in the data are fully captured by the model.

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.
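The Breusch-Godfrey statistic reported by checkresiduals() can be reproduced by hand, which may help when interpreting it: regress the residuals on the original predictors plus their own lags, then compare n times the auxiliary R-squared with a chi-squared distribution. The sketch below is a simplified base-R version on simulated data. The function name `bg_test` and all data are hypothetical, and lagged residuals are padded with zeros, which is one common convention.

```r
bg_test <- function(fit, order = 1) {
  e <- residuals(fit)
  n <- length(e)
  X <- model.matrix(fit)
  # Columns of lagged residuals, padded with zeros at the start
  lagged <- sapply(seq_len(order), function(k) c(rep(0, k), e[seq_len(n - k)]))
  # Auxiliary regression of residuals on predictors and lagged residuals
  aux <- lm.fit(cbind(X, lagged), e)
  r2  <- 1 - sum(aux$residuals^2) / sum(e^2)
  lm_stat <- n * r2
  c(statistic = lm_stat,
    df        = order,
    p.value   = pchisq(lm_stat, df = order, lower.tail = FALSE))
}

# Simulated regression with AR(1) errors, so serial correlation is present
set.seed(42)
n   <- 300
x   <- rnorm(n)
err <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))
y   <- 1 + 2 * x + err
fit <- lm(y ~ x)

result <- bg_test(fit, order = 5)
print(result)
```

A small p-value, such as the 0.02551 reported for hour 15, rejects the null hypothesis of no serial correlation up to the tested order.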

Hour 16 Analysis

ggplot(hour_16_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 16",
       x = "Date",
       y = "Production") +
  theme_minimal()

hour_16_data <- all_data[all_data$hour == 16, ]
hour_16_data <- hour_16_data[,-c(2)]
hour_16_data$trend_hour_16 <- 1:nrow(hour_16_data)
hour_16_data[, lag_16_production := shift(production,1)]
hour_16_data[,lag_16_diff:=production-lag_16_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_16_data <- hour_16_data[!is.na(lag_16_production)]
lm_hour_16 <- lm(production ~+lag_16_production+trend_hour_16+ special_period+DSWRF_surface+USWRF_top_of_atmosphere+hourly_cloud_average+is.ramadan+month*hourly_max_t, data = hour_16_data)
summary(lm_hour_16)
## 
## Call:
## lm(formula = production ~ +lag_16_production + trend_hour_16 + 
##     special_period + DSWRF_surface + USWRF_top_of_atmosphere + 
##     hourly_cloud_average + is.ramadan + month * hourly_max_t, 
##     data = hour_16_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.2672 -0.6417 -0.0248  0.5744  8.0767 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -1.642e+01  9.709e+00  -1.691  0.09127 .  
## lag_16_production        2.812e-01  3.206e-02   8.771  < 2e-16 ***
## trend_hour_16           -7.781e-04  2.578e-04  -3.018  0.00263 ** 
## special_period           5.210e-01  1.706e-01   3.053  0.00234 ** 
## DSWRF_surface            5.978e-03  1.043e-03   5.732  1.4e-08 ***
## USWRF_top_of_atmosphere  5.059e-03  1.644e-03   3.078  0.00216 ** 
## hourly_cloud_average    -7.726e-03  3.161e-03  -2.445  0.01471 *  
## is.ramadan               4.665e-01  2.410e-01   1.936  0.05323 .  
## monthAug                 1.889e+01  1.999e+01   0.945  0.34481    
## monthDec                 7.248e+00  1.833e+01   0.395  0.69268    
## monthFeb                -6.156e+00  1.149e+01  -0.536  0.59220    
## monthJan                -1.415e+01  1.269e+01  -1.115  0.26523    
## monthJul                -1.221e+00  1.635e+01  -0.075  0.94048    
## monthJun                -1.114e+01  1.600e+01  -0.696  0.48650    
## monthMar                 7.008e+00  1.108e+01   0.633  0.52723    
## monthMay                 1.781e+01  1.450e+01   1.228  0.21965    
## monthNov                 3.108e+01  1.427e+01   2.178  0.02968 *  
## monthOct                 3.476e+01  1.779e+01   1.954  0.05108 .  
## monthSep                 5.575e+00  1.619e+01   0.344  0.73060    
## hourly_max_t             4.936e-02  3.289e-02   1.501  0.13378    
## monthAug:hourly_max_t   -6.132e-02  6.497e-02  -0.944  0.34556    
## monthDec:hourly_max_t   -2.120e-02  6.462e-02  -0.328  0.74294    
## monthFeb:hourly_max_t    2.485e-02  4.013e-02   0.619  0.53597    
## monthJan:hourly_max_t    5.522e-02  4.490e-02   1.230  0.21915    
## monthJul:hourly_max_t    1.735e-03  5.405e-02   0.032  0.97440    
## monthJun:hourly_max_t    3.589e-02  5.381e-02   0.667  0.50495    
## monthMar:hourly_max_t   -2.419e-02  3.829e-02  -0.632  0.52768    
## monthMay:hourly_max_t   -5.953e-02  4.935e-02  -1.206  0.22805    
## monthNov:hourly_max_t   -1.057e-01  4.925e-02  -2.146  0.03214 *  
## monthOct:hourly_max_t   -1.186e-01  6.042e-02  -1.964  0.04990 *  
## monthSep:hourly_max_t   -1.988e-02  5.387e-02  -0.369  0.71226    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.325 on 814 degrees of freedom
## Multiple R-squared:  0.689,  Adjusted R-squared:  0.6775 
## F-statistic:  60.1 on 30 and 814 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_16)

## 
##  Breusch-Godfrey test for serial correlation of order up to 34
## 
## data:  Residuals
## LM test = 77.024, df = 34, p-value = 3.504e-05
plot(lm_hour_16)

Model Summary:

  1. Lag 16 Production: Highly significant with a positive impact, indicating that the previous day's production at hour 16 strongly influences the current day's value. (The lag is taken within the hour-16 subset, so it spans one day, not 16 hours.)

  2. Trend Hour 16: Significant with a negative impact, suggesting a slight downward trend over time.

  3. Special Period: Significant with a positive impact, indicating that special periods might slightly increase production.

  4. DSWRF Surface: Highly significant with a positive impact, suggesting increased production with higher downward shortwave radiation.

  5. USWRF Top of Atmosphere: Significant with a positive impact, indicating higher production with increased upward shortwave radiation.

  6. Hourly Cloud Average: Significant with a negative impact, indicating lower production with increased cloud cover.

  7. Is Ramadan: Marginally significant with a positive impact (p = 0.053), suggesting possibly slightly higher production during Ramadan.

  8. Monthly Effects: The October and November interactions with hourly_max_t are significant at the 5% level (October only narrowly, p = 0.0499), indicating some seasonal variation in the temperature effect.
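Because the model uses month * hourly_max_t, the printed hourly_max_t coefficient is the slope for the baseline month only, and each month:hourly_max_t term is a difference from that baseline. April appears to be the baseline here, since it is the only month without its own row in the output. As a worked example with the hour 16 estimates:

```r
# Estimates copied from the hour 16 summary output
slope_baseline  <- 4.936e-02    # hourly_max_t (baseline month)
interaction_nov <- -1.057e-01   # monthNov:hourly_max_t
interaction_oct <- -1.186e-01   # monthOct:hourly_max_t

# Month-specific slope = baseline slope + interaction term
slope_nov <- slope_baseline + interaction_nov
slope_oct <- slope_baseline + interaction_oct

print(c(November = slope_nov, October = slope_oct))
```

Note that the p-value of an interaction term tests the difference from the baseline slope, not whether that month's slope differs from zero.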

Residuals and Diagnostics:

  1. Residual Standard Error: 1.325 on 814 degrees of freedom, which measures the typical size of the residuals (prediction errors).

  2. Multiple R-squared: 0.689, suggesting that the model explains about 68.9% of the variability in production.

  3. Adjusted R-squared: 0.6775, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  4. F-statistic: 60.1 on 30 and 814 DF (p-value < 2.2e-16), indicating that the model provides a better fit than a model with no predictors.

The Weighted Mean Absolute Percentage Error (WMAPE) for hour 16 is 39.61%, which we still consider acceptable.

Visualization and Interpretation:

  1. Hourly Production Data for Hour 16: The data shows significant fluctuations, indicating variability in production.

  2. Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

    • ACF Plot (Autocorrelation of Residuals): Indicates autocorrelation, which the Breusch-Godfrey test confirms (LM = 77.024, df = 34, p-value = 3.504e-05), suggesting that not all patterns in the data are fully captured by the model.

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 17 Analysis

A log transformation was considered for hour 17. However, we decided not to use it, since it requires removing the zero-production observations, which would break the continuity of the daily series.
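The problem with zeros can be seen in a small sketch on made-up values (the vector `production` and the dates below are hypothetical, not taken from the data set):

```r
# Hypothetical daily production at one hour, including zero-output days
production <- c(0.8, 0.0, 1.2, 0.0, 0.0, 0.5)
dates      <- as.Date("2024-01-01") + 0:5

# log(0) is -Inf, which lm() cannot use as a response value
print(log(production))

# Dropping zero rows keeps the log finite but leaves gaps in the series
kept_dates <- dates[production > 0]
gaps       <- as.numeric(diff(kept_dates))
print(gaps)
```

With gaps of two and three days between the remaining rows, a one-step lag would no longer refer to the previous day.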

ggplot(hour_17_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 17",
       x = "Date",
       y = "Production") +
  theme_minimal()

hour_17_data <- all_data[all_data$hour == 17, ]
hour_17_data <- hour_17_data[,-c(2)]
hour_17_data$trend_hour_17 <- 1:nrow(hour_17_data)
hour_17_data$nm <- as.numeric(format(hour_17_data$date, "%m"))
# Flag non-winter months (everything except December, January and February)
hour_17_data$is_not_winter <- as.numeric(!hour_17_data$nm %in% c(12, 1, 2))
hour_17_data <- as.data.table(hour_17_data)
#hour_17_data <- hour_17_data[production > 0]
#hour_17_data[, log_production := log(production)]
hour_17_data[, lag_17_production := shift(production,1)]
hour_17_data[,lag_17_diff:=production-lag_17_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_17_data <- hour_17_data[!is.na(lag_17_production)]
hour_17_data <- hour_17_data[!is.na(lag_17_diff)]
lm_hour_17 <- lm(production ~+lag_17_production+trend_hour_17+special_period+is.ramadan+is.religousday+USWRF_surface+USWRF_top_of_atmosphere+DSWRF_surface+month*hourly_max_t, data = hour_17_data)
summary(lm_hour_17)
## 
## Call:
## lm(formula = production ~ +lag_17_production + trend_hour_17 + 
##     special_period + is.ramadan + is.religousday + USWRF_surface + 
##     USWRF_top_of_atmosphere + DSWRF_surface + month * hourly_max_t, 
##     data = hour_17_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.2139 -0.2809 -0.0096  0.1888  4.0585 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -2.306e+00  6.176e+00  -0.373   0.7090    
## lag_17_production        5.039e-01  3.057e-02  16.486  < 2e-16 ***
## trend_hour_17           -6.348e-04  1.585e-04  -4.006 6.75e-05 ***
## special_period           2.517e-01  9.986e-02   2.521   0.0119 *  
## is.ramadan               2.729e-01  1.400e-01   1.950   0.0516 .  
## is.religousday          -1.436e-01  1.685e-01  -0.852   0.3943    
## USWRF_surface           -4.230e-03  2.670e-03  -1.584   0.1135    
## USWRF_top_of_atmosphere  1.182e-03  1.181e-03   1.001   0.3170    
## DSWRF_surface            3.048e-03  1.289e-03   2.364   0.0183 *  
## monthAug                 2.927e-01  1.256e+01   0.023   0.9814    
## monthDec                 3.770e-02  1.107e+01   0.003   0.9973    
## monthFeb                 6.113e+00  8.106e+00   0.754   0.4510    
## monthJan                -6.389e-01  7.768e+00  -0.082   0.9345    
## monthJul                -8.086e+00  1.075e+01  -0.752   0.4521    
## monthJun                 1.262e+01  1.001e+01   1.261   0.2077    
## monthMar                 8.033e+00  7.692e+00   1.044   0.2966    
## monthMay                 1.500e+01  9.482e+00   1.582   0.1140    
## monthNov                 6.433e+00  9.207e+00   0.699   0.4849    
## monthOct                 2.789e+00  1.125e+01   0.248   0.8043    
## monthSep                 9.095e+00  1.017e+01   0.895   0.3713    
## hourly_max_t             6.029e-03  2.146e-02   0.281   0.7788    
## monthAug:hourly_max_t   -7.223e-05  4.135e-02  -0.002   0.9986    
## monthDec:hourly_max_t    1.846e-03  3.946e-02   0.047   0.9627    
## monthFeb:hourly_max_t   -2.127e-02  2.834e-02  -0.751   0.4531    
## monthJan:hourly_max_t    3.744e-03  2.748e-02   0.136   0.8917    
## monthJul:hourly_max_t    2.613e-02  3.588e-02   0.728   0.4667    
## monthJun:hourly_max_t   -4.251e-02  3.399e-02  -1.250   0.2115    
## monthMar:hourly_max_t   -2.874e-02  2.677e-02  -1.074   0.2832    
## monthMay:hourly_max_t   -5.038e-02  3.256e-02  -1.547   0.1222    
## monthNov:hourly_max_t   -2.148e-02  3.213e-02  -0.669   0.5039    
## monthOct:hourly_max_t   -9.678e-03  3.861e-02  -0.251   0.8022    
## monthSep:hourly_max_t   -3.093e-02  3.428e-02  -0.902   0.3672    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7684 on 813 degrees of freedom
## Multiple R-squared:  0.679,  Adjusted R-squared:  0.6668 
## F-statistic: 55.48 on 31 and 813 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_17)

## 
##  Breusch-Godfrey test for serial correlation of order up to 35
## 
## data:  Residuals
## LM test = 144.34, df = 35, p-value = 3.151e-15
plot(lm_hour_17)

Model Summary:

  1. Lag 17 Production: Highly significant with a positive impact, indicating that the previous day's production at hour 17 strongly influences the current day's value. (The lag is taken within the hour-17 subset, so it spans one day, not 17 hours.)

  2. Trend Hour 17: Significant with a negative impact, suggesting a slight downward trend over time.

  3. Special Period: Significant with a positive impact, indicating that special periods might slightly increase production.

  4. Is Ramadan: Marginally significant with a positive impact (p = 0.052), suggesting possibly slightly higher production during Ramadan.

  5. USWRF Surface: Negative estimate but not significant (p = 0.114), so the model provides no clear evidence of an effect of upward shortwave radiation at the surface.

  6. DSWRF Surface: Significant with a positive impact, suggesting increased production with higher downward shortwave radiation.

  7. Monthly Effects: None of the month terms or their interactions with hourly_max_t is significant, even at the 10% level, so this model shows no clear seasonal variation in the temperature effect.

Residuals and Diagnostics:

  1. Residual Standard Error: 0.7684 on 813 degrees of freedom, which measures the typical size of the residuals (prediction errors).

  2. Multiple R-squared: 0.679, suggesting that the model explains about 67.9% of the variability in production.

  3. Adjusted R-squared: 0.6668, slightly lower than Multiple R-squared, accounting for the number of predictors in the model.

  4. F-statistic: 55.48 on 31 and 813 DF (p-value < 2.2e-16), indicating that the model provides a better fit than a model with no predictors.

The Weighted Mean Absolute Percentage Error (WMAPE) for hour 17 is 61.69%, which is a high error.

Visualization and Interpretation:

  1. Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

    • ACF Plot (Autocorrelation of Residuals): Indicates strong autocorrelation, which the Breusch-Godfrey test confirms (LM = 144.34, df = 35, p-value = 3.151e-15), suggesting that not all patterns in the data are fully captured by the model.

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.

Hour 18 Analysis

ggplot(hour_18_data, aes(x = date, y = production)) +
  geom_line() +
  labs(title = "Hourly Production Data for Hour 18",
       x = "Date",
       y = "Production") +
  theme_minimal()

hour_18_data <- all_data[all_data$hour == 18, ]
hour_18_data <- hour_18_data[,-c(2)]
# Zero-production rows are kept; the filter below was tried and left disabled
#data_18_filtered <- hour_18_data[hour_18_data$production != 0, ]
hour_18_data$trend_hour_18 <- 1:nrow(hour_18_data)
hour_18_data[, lag_18_production := shift(production,1)]
hour_18_data[,lag_18_diff:=production-lag_18_production]
# Remove rows with NA in lagged production to ensure the model can run
hour_18_data <- hour_18_data[!is.na(lag_18_production)]
lm_hour_18 <- lm(production ~+lag_18_production +trend_hour_18+special_period+TMP_surface+DSWRF_surface+month*hourly_max_t  , data = hour_18_data)
summary(lm_hour_18)
## 
## Call:
## lm(formula = production ~ +lag_18_production + trend_hour_18 + 
##     special_period + TMP_surface + DSWRF_surface + month * hourly_max_t, 
##     data = hour_18_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7237 -0.0471 -0.0031  0.0060  3.7605 
## 
## Coefficients: (1 not defined because of singularities)
##                         Estimate Std. Error t value Pr(>|t|)   
## (Intercept)           -1.021e-01  2.286e+00  -0.045  0.96438   
## lag_18_production      1.040e-01  3.482e-02   2.987  0.00290 **
## trend_hour_18          2.088e-05  5.039e-05   0.414  0.67866   
## special_period         9.697e-02  3.446e-02   2.814  0.00500 **
## TMP_surface            3.467e-04  8.013e-03   0.043  0.96550   
## DSWRF_surface         -6.681e-06  1.368e-04  -0.049  0.96107   
## monthAug              -1.342e+01  4.901e+00  -2.738  0.00632 **
## monthDec               1.598e-01  3.799e+00   0.042  0.96647   
## monthFeb               2.223e-01  2.628e+00   0.085  0.93261   
## monthJan               2.719e-01  2.754e+00   0.099  0.92137   
## monthJul              -6.414e+00  4.419e+00  -1.452  0.14698   
## monthJun               2.336e+00  4.055e+00   0.576  0.56485   
## monthMar               2.718e-01  2.658e+00   0.102  0.91857   
## monthMay              -1.572e+00  3.689e+00  -0.426  0.67009   
## monthNov               4.387e-02  3.569e+00   0.012  0.99020   
## monthOct              -6.391e-01  4.132e+00  -0.155  0.87710   
## monthSep               2.067e-01  3.740e+00   0.055  0.95594   
## hourly_max_t                  NA         NA      NA       NA   
## monthAug:hourly_max_t  4.489e-02  1.637e-02   2.742  0.00624 **
## monthDec:hourly_max_t -5.943e-04  1.362e-02  -0.044  0.96520   
## monthFeb:hourly_max_t -8.148e-04  9.300e-03  -0.088  0.93020   
## monthJan:hourly_max_t -9.999e-04  9.801e-03  -0.102  0.91876   
## monthJul:hourly_max_t  2.228e-02  1.491e-02   1.494  0.13545   
## monthJun:hourly_max_t -7.369e-03  1.389e-02  -0.530  0.59595   
## monthMar:hourly_max_t -9.833e-04  9.352e-03  -0.105  0.91629   
## monthMay:hourly_max_t  5.688e-03  1.279e-02   0.445  0.65657   
## monthNov:hourly_max_t -2.662e-04  1.263e-02  -0.021  0.98319   
## monthOct:hourly_max_t  2.042e-03  1.442e-02   0.142  0.88747   
## monthSep:hourly_max_t -8.880e-04  1.282e-02  -0.069  0.94478   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2704 on 817 degrees of freedom
## Multiple R-squared:  0.1987, Adjusted R-squared:  0.1723 
## F-statistic: 7.506 on 27 and 817 DF,  p-value: < 2.2e-16
checkresiduals(lm_hour_18)

## 
##  Breusch-Godfrey test for serial correlation of order up to 32
## 
## data:  Residuals
## LM test = 187.64, df = 32, p-value < 2.2e-16
plot(lm_hour_18)

Model Summary:

  1. Lag 18 Production: Significant with a positive impact, indicating that the previous day's production at hour 18 influences the current day's value. (The lag is taken within the hour-18 subset, so it spans one day, not 18 hours.)

  2. Trend Hour 18: Not significant, suggesting the trend does not have a noticeable effect.

  3. Special Period: Significant with a positive impact, indicating that special periods might slightly increase production.

  4. TMP Surface: Not significant (p = 0.966), indicating that surface temperature has no detectable effect on production at this hour.

  5. DSWRF Surface: Not significant, indicating it does not significantly affect production.

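  6. Monthly Effects: Only August (both its main effect and its interaction with hourly_max_t) is significant, so seasonal variation is limited to that month. The hourly_max_t main effect could not be estimated because of a singularity.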
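The lag terms in these models are built by filtering the data to a single hour and then shifting by one row, so each lagged value comes from the same hour on the previous day. A minimal sketch on a made-up hourly table (`all_hours` is hypothetical, and the base-R line stands in for data.table's shift(production, 1)):

```r
# Hypothetical hourly table: 3 days x 24 hours, production encodes day and hour
all_hours <- expand.grid(hour = 0:23, day = 1:3)
all_hours$production <- all_hours$day * 100 + all_hours$hour

# Keep one hour of the day, as in all_data[all_data$hour == 18, ]
hour_18 <- all_hours[all_hours$hour == 18, ]

# Base-R equivalent of data.table's shift(production, 1)
hour_18$lag_18_production <- c(NA, head(hour_18$production, -1))

print(hour_18)
```

In the printed table, day 2 is paired with 118 (day 1, hour 18) rather than 217 (day 2, hour 17).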

Residuals and Diagnostics:

  1. Residual Standard Error: 0.2704 on 817 degrees of freedom, which measures the typical size of the residuals (prediction errors).

  2. Multiple R-squared: 0.1987, suggesting that the model explains only about 19.87% of the variability in production.

  3. Adjusted R-squared: 0.1723, lower than Multiple R-squared, accounting for the number of predictors in the model.

  4. F-statistic: 7.506 on 27 and 817 DF (p-value < 2.2e-16), indicating that the model provides a better fit than a model with no predictors.

The Weighted Mean Absolute Percentage Error (WMAPE) for hour 18 is 87.70%, which is a very high error.
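A high WMAPE can coexist with a small residual standard error (0.2704 here, against 2.007 for hour 14) because the measure is relative to the production level. The sketch below illustrates this with made-up values, assuming WMAPE is the sum of absolute errors divided by the sum of absolute actuals (the report does not state the formula):

```r
# Assumed WMAPE definition: sum(|actual - forecast|) / sum(|actual|)
wmape <- function(actual, forecast) sum(abs(actual - forecast)) / sum(abs(actual))

# Same absolute errors (0.2 per observation) at two production levels
errors      <- c(0.2, -0.2, 0.2, -0.2)
high_actual <- c(10, 12, 11, 9)        # midday-like levels (made up)
low_actual  <- c(0.3, 0.1, 0.2, 0.2)   # evening-like levels (made up)

wmape_high <- wmape(high_actual, high_actual + errors)
wmape_low  <- wmape(low_actual,  low_actual  + errors)
print(round(100 * c(high = wmape_high, low = wmape_low), 1))
```

Identical absolute errors give about 1.9% at the higher level and 100% at the lower one, so WMAPE values for low-production hours should be compared with care.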

Visualization and Interpretation:

  1. Residuals Analysis:

    • Top Plot (Residuals over time): Shows periods of higher residuals, indicating times when the model predictions were less accurate.

    • ACF Plot (Autocorrelation of Residuals): Indicates strong autocorrelation, which the Breusch-Godfrey test confirms (LM = 187.64, df = 32, p-value < 2.2e-16), suggesting that not all patterns in the data are fully captured by the model.

    • Histogram (Distribution of Residuals): Centered around zero but with some deviations, indicating areas where predictions might be off.